TB diagnostic based on antigens from M. tuberculosis

ABSTRACT

The present invention is based on the identification and characterization of a number of novel M. tuberculosis-derived proteins and protein fragments. The invention is directed to the polypeptides and immunologically active fragments thereof, the genes encoding them, and immunological compositions such as diagnostic reagents containing the polypeptides.

This application is a continuation of U.S. application Ser. No. 11/196,018, filed on Aug. 2, 2005, which is a divisional of:

-   U.S. application Ser. No. 10/138,473, filed on May 2, 2002, now U.S. Pat. No. 6,982,085, which is a continuation-in-part of U.S. patent application Ser. No. 10/060,428, filed Jan. 29, 2002, now abandoned, which is a continuation-in-part of U.S. application Ser. No. 09/415,884, filed Oct. 8, 1999, now abandoned, which is a non-provisional of U.S. Application No. 60/116,673, filed Jan. 21, 1999, which claims priority to
-   U.S. patent application Ser. No. 09/050,739, filed 30 Mar. 1998, now U.S. Pat. No. 6,641,814, which claims priority from U.S. Provisional Application No. 60/044,624, filed 18 Apr. 1997, U.S. Provisional Application No. 60/070,488, filed 5 Jan. 1998, and Danish Patent Applications Nos. DK 1997 00376, filed 2 Apr. 1997, and DK 1997 01277, filed 10 Nov. 1997;
-   U.S. patent application Ser. No. 09/791,171, filed 20 Feb. 2001, now abandoned, which is a divisional of the above mentioned U.S. patent application Ser. No. 09/050,739, claiming the same priorities; and
-   U.S. patent application Ser. No. 09/415,884, filed 8 Oct. 1999, now abandoned, which claims priority from U.S. Provisional Application No. 60/116,673, filed 21 Jan. 1999 and Danish Patent Application No. DK 1998 01281, filed 8 Oct. 1998,

each of which application is incorporated by reference in its entirety.

Each of these patent applications as well as all documents cited in the text of this application, and references cited in the documents referred to in this application (including references cited in the aforementioned patent applications or during their prosecution) are hereby incorporated herein by reference and for which priority is claimed under 35 U.S.C. § 120.

FIELD OF INVENTION

The present invention relates to a number of immunologically active, novel polypeptide fragments derived from Mycobacterium tuberculosis, diagnostics and other immunologic compositions containing the fragments as immunogenic components, and methods of production and use of the polypeptides. The invention also relates to novel nucleic acid fragments derived from M. tuberculosis which are useful in the preparation of the polypeptide fragments of the invention or in the diagnosis of infection with M. tuberculosis. The invention further relates to certain fusion polypeptides.

GENERAL BACKGROUND

Human tuberculosis caused by Mycobacterium tuberculosis (M. tuberculosis) is a severe global health problem, responsible for approx. 3 million deaths annually, according to the WHO. The worldwide incidence of new tuberculosis (TB) cases had been falling during the 1960s and 1970s but during recent years this trend has markedly changed in part due to the advent of AIDS and the appearance of multidrug resistant strains of M. tuberculosis.

In 1998, Cole et al. published the complete genome sequence of M. tuberculosis and predicted the presence of approximately 4000 open reading frames (Cole et al., 1998). Nucleotide sequences and putative protein sequences are described. Importantly, however, this sequence information cannot be used to predict whether the DNA is translated and expressed as protein in vivo. More importantly, it is not possible on the basis of the sequences to predict whether a given sequence will encode an immunogenic or an inactive protein. The only way to determine whether a protein is recognized by the immune system during or after an infection with M. tuberculosis is to produce the given protein and test it in an appropriate assay as described herein.

Short-term culture filtrate (ST-CF) is a complex mixture of proteins released from M. tuberculosis during the first few days of growth in a liquid medium (Andersen et al., 1991). Culture filtrates have been suggested to hold protective antigens recognized by the host in the first phase of TB infection (Andersen et al., 1991; Orme et al., 1993). Recent data from several laboratories have demonstrated that experimental subunit vaccines based on culture filtrate antigens can provide high levels of acquired resistance to TB (Pal and Horwitz, 1992; Roberts et al., 1995; Andersen, 1994; Lindblad et al., 1997). Culture filtrates are, however, complex protein mixtures and until now very limited information has been available on the molecules responsible for this protective immune response.

It is thus an object of the present invention to provide a composition for the determination of an immune response against a virulent Mycobacterium such as a diagnostic reagent for the diagnosis of an infection with a virulent Mycobacterium.

SUMMARY OF THE INVENTION

The present invention is i.a. based on the identification and characterization of a number of previously uncharacterized culture filtrate antigens from M. tuberculosis. In animal models of TB, T cells mediating immunity are focused predominantly on antigens in the regions 6-12 and 17-30 kDa of STCF. In the present invention 8 antigens in the low molecular weight region (CFP7, CFP7A, CFP7B, CFP8A, CFP8B, CFP9, CFP10A, and CFP11) and 18 antigens (CFP16, CFP17, CFP19, CFP19B, CFP20, CFP21, CFP22, CFP22A, CFP23, CFP23A, CFP23B, CFP25, CFP26, CFP27, CFP28, CFP29, CFP30A, and CFP30B) in the 17-30 kDa region have been identified.

Finally, the invention is based on the surprising discovery that fusions between ESAT-6 and MPT59 are superior immunogens compared to the respective unfused proteins.

The following table lists the antigens of the invention by the names used herein as well as by reference to relevant SEQ ID NOs of N-terminal sequences, full amino acid sequences and sequences of DNA encoding the antigens:

Antigen   N-terminal sequence   Nucleotide sequence   Amino acid sequence
          SEQ ID NO:            SEQ ID NO:            SEQ ID NO:
--------  --------------------  --------------------  --------------------
CFP7                            1                     2

It is well-known in the art that T-cell epitopes are responsible for the elicitation of the acquired immunity against TB, whereas B-cell epitopes are without any significant influence on acquired immunity and recognition of mycobacteria in vivo. Since such T-cell epitopes are linear and are known to have a minimum length of 6 amino acid residues, the present invention is especially concerned with the identification and utilisation of such T-cell epitopes.

Hence, in its broadest aspect the invention relates to a substantially pure polypeptide fragment which

a) comprises an amino acid sequence selected from the sequences shown in SEQ ID NO: 2,

b) comprises a subsequence of the polypeptide fragment defined in a) which has a length of at least 6 amino acid residues, said subsequence being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex, or

c) comprises an amino acid sequence having a sequence identity with the polypeptide defined in a) or the subsequence defined in b) of at least 70% and at the same time being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex,

with the proviso that

i) the polypeptide fragment is in essentially pure form when consisting of the amino acid sequence 1-96 of SEQ ID NO: 2, and

ii) the degree of sequence identity in c) is at least 95% when the polypeptide comprises a homologue of a polypeptide which has the amino acid sequence SEQ ID NO: 2 or a subsequence thereof as defined in b).

Other parts of the invention pertain to the DNA fragments encoding a polypeptide with the above definition as well as to DNA fragments useful for determining the presence of DNA encoding such polypeptides, such a fragment having a length of at least 10 nucleotides and hybridizing readily under stringent hybridization conditions.

It is surprisingly demonstrated herein that several polypeptides isolated from the cell wall, cell membrane or cytosol and short term culture filtrate (STCF) are recognized by human tuberculosis antisera.

Therefore it is considered likely that these polypeptides, either alone or in combination, can be useful as diagnostic reagents in the diagnosis of tuberculosis.

The present inventors contemplate that in order to achieve a very high sensitivity for a serodiagnostic TB reagent it is important to combine two or more TB antigens or, alternatively, to use recombinant fusion proteins comprising at least two proteins or B-cell epitopes. The antibody response in tuberculosis is heterogeneous, with considerable person-to-person variance in which antigens are recognized by the antibodies, and it can therefore be an advantage to use combinations of proteins (e.g. in protein cocktails), which may increase the sensitivity and be recognized by sera from a high proportion of infected individuals. In particular, it is advantageous to combine from two to four antigens, which will give a higher sensitivity than a single antigen while retaining a high specificity (more than 90%).

Thus, the invention is related to detection of infections caused by species of the tuberculosis complex (M. tuberculosis, M. bovis, M. africanum) by the use of a combination of two or more polypeptides comprising an M. tuberculosis antigen or an immunogenic portion or other variant thereof, or by the use of two or more DNA sequences encoding an M. tuberculosis antigen or an immunogenic portion or other variant thereof.
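By way of illustration only, the reasoning behind combining antigens can be sketched numerically. The sketch below assumes that a serum is scored positive when it reacts with at least one antigen of the cocktail; this scoring rule, the function names and the reactivity data are all hypothetical and are not taken from the present disclosure.

```python
from typing import Sequence


def cocktail_positive(reactivity: Sequence[bool]) -> bool:
    """Assumed rule: a serum is positive if it reacts with any antigen."""
    return any(reactivity)


def sensitivity(patient_sera: Sequence[Sequence[bool]]) -> float:
    """Fraction of sera from TB patients scored positive."""
    return sum(cocktail_positive(s) for s in patient_sera) / len(patient_sera)


def specificity(control_sera: Sequence[Sequence[bool]]) -> float:
    """Fraction of sera from uninfected controls scored negative."""
    return sum(not cocktail_positive(s) for s in control_sera) / len(control_sera)


# Hypothetical reactivity of each serum against three antigens (illustrative only).
PATIENT_SERA = [
    (True, False, False),
    (False, True, False),
    (False, False, False),
    (False, False, True),
    (True, True, False),
]
CONTROL_SERA = [(False, False, False)] * 19 + [(False, True, False)]

# Sensitivity when only the first antigen is used, versus the full cocktail.
single_antigen_sensitivity = sensitivity([s[:1] for s in PATIENT_SERA])
cocktail_sensitivity = sensitivity(PATIENT_SERA)
cocktail_specificity = specificity(CONTROL_SERA)
```

With these made-up data the cocktail detects more patient sera than the single antigen, at the cost of one false positive among the controls, which is the trade-off between sensitivity and specificity referred to above.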

DETAILED DISCLOSURE OF THE INVENTION

The invention relates to polypeptides which induce specific antibody responses in a TB patient, as determined by an ELISA technique or a western blot in which the whole blood is diluted 1:20 in PBS and stimulated with the polypeptide at a concentration of at most 20 μg/ml, the polypeptide inducing an OD of at least 0.1 in ELISA or a visual response in western blot.

Any polypeptide fulfilling the above property and which is obtainable from either the cell wall, cell membrane, the cytosol or STCF of the tuberculosis complex is within the scope of the present invention.
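The ELISA part of the above property can be expressed as a simple check. The following is a minimal sketch using only the figures stated above (1:20 dilution, at most 20 μg/ml polypeptide, OD of at least 0.1); the class and function names are hypothetical, and no background subtraction is applied because none is specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElisaReading:
    od: float                      # optical density measured in the ELISA
    polypeptide_ug_per_ml: float   # polypeptide concentration used
    dilution_factor: int           # 20 corresponds to a 1:20 dilution in PBS


def meets_elisa_criterion(reading: ElisaReading) -> bool:
    """True if the reading satisfies the stated ELISA property."""
    return (
        reading.dilution_factor == 20
        and reading.polypeptide_ug_per_ml <= 20.0
        and reading.od >= 0.1
    )
```

For example, `meets_elisa_criterion(ElisaReading(od=0.15, polypeptide_ug_per_ml=10.0, dilution_factor=20))` evaluates to `True`, whereas the same reading with an OD of 0.05 does not satisfy the criterion.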

In an important embodiment, the invention relates to a composition comprising a combination of two or more (e.g. 2, 3, 4, 5, 6, 7 or more) substantially pure polypeptides, which comprises one or more amino acid sequences selected from

-   (a) Rv0652, Rv2462c, Rv1984c, Rv2185c, Rv1636, Rv3451, Rv3872, Rv3354 and Rv2623;
-   (b) an immunogenic portion of any one of the sequences in (a); and/or
-   (c) an amino acid sequence analogue having at least 70% sequence identity to any one of the sequences in (a) or (b) and at the same time being immunogenic;

for use as a pharmaceutical or diagnostic reagent.

Also, the invention relates to a composition comprising one or more fusion polypeptides, which comprises one or more amino acid sequences selected from

-   (a) Rv0652, Rv2462c, Rv1984c, Rv2185c, Rv1636, Rv3451, Rv3872, Rv3354 and Rv2623;
-   (b) an immunogenic portion of any one of the sequences in (a); and/or
-   (c) an amino acid sequence analogue having at least 70% sequence identity to any one of the sequences in (a) or (b) and at the same time being immunogenic;

and at least one fusion partner.

The fusion partner comprises preferably a polypeptide fragment selected from

-   (a) a polypeptide fragment derived from a virulent mycobacterium, such as ESAT-6, MPB64, MPT64, TB10.4, CFP10, RD1-ORF5, RD1-ORF2, Rv1036, Ag85A, Ag85B, Ag85C, 19 kDa lipoprotein, MPT32, MPB59 and alpha-crystallin;
-   (b) a polypeptide as defined above; and/or
-   (c) at least one immunogenic portion of any of such polypeptides in (a) or (b).

In another embodiment, the invention relates to an immunogenic composition comprising a composition according to the invention.

In a further embodiment, the invention relates to the use of a composition as defined above for the preparation of a pharmaceutical composition, e.g. for diagnosis of tuberculosis caused by virulent mycobacteria, e.g. by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.

In a still further embodiment, the invention relates to a diagnostic tool comprising a combination of two or more substantially pure polypeptides, which comprises one or more amino acid sequences selected from

-   (a) Rv0652, Rv2462c, Rv1984c, Rv2185c, Rv1636, Rv3451, Rv3872, Rv3354 and Rv2623;
-   (b) an immunogenic portion of any one of the sequences in (a); and/or
-   (c) an amino acid sequence analogue having at least 70% sequence identity to any one of the sequences in (a) or (b) and at the same time being immunogenic.

Also, the invention relates to a substantially pure polypeptide, which comprises an amino acid sequence selected from

-   (a) Rv0652, Rv2462c, Rv1984c, Rv2185c, Rv1636, Rv3451, Rv3872, Rv3354 and Rv2623;
-   (b) an immunogenic portion of any one of the sequences in (a); and/or
-   (c) an amino acid sequence analogue having at least 70% sequence identity to any one of the sequences in (a) or (b) and at the same time being immunogenic;

for use in preparing a composition according to the invention or a diagnostic tool according to the invention.

The polypeptide fragments of the invention preferably comprise an amino acid sequence of at least 6 amino acid residues in length which has a higher sequence identity than 70 percent with SEQ ID NO: 2. A preferred minimum percentage of sequence identity is at least 80%, such as at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and at least 99.5%.
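How a percentage of sequence identity might be computed is sketched below. This section does not state the alignment method, so the sketch assumes the simplest case: an ungapped, position-by-position comparison of a fragment against every equal-length window of a reference sequence. The function names and the example sequences are hypothetical and are not SEQ ID NO: 2; in practice a gapped alignment tool would normally be used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical residues between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(fragment: str, reference: str, min_len: int = 6) -> float:
    """Highest ungapped identity of the fragment against any window of the reference.

    min_len reflects the minimum fragment length of 6 residues stated in the text.
    """
    if len(fragment) < min_len:
        raise ValueError(f"fragment must be at least {min_len} residues long")
    if len(reference) < len(fragment):
        return 0.0
    n = len(fragment)
    return max(
        percent_identity(fragment, reference[i:i + n])
        for i in range(len(reference) - n + 1)
    )
```

A fragment would then be compared against a chosen threshold, for example `best_window_identity(fragment, reference) > 70.0` for the broadest limit or `>= 95.0` for one of the preferred limits.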

In a further embodiment, the invention relates to a nucleic acid fragment in isolated form which

-   (a) comprises one or more nucleic acid sequences which encodes a polypeptide as defined above, or comprises a nucleic acid sequence complementary thereto; or
-   (b) has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions with a nucleotide sequence selected from Rv0652, Rv2462c, Rv1984c, Rv2185c, Rv1636, Rv3451, Rv3872, Rv3354 and Rv2623 nucleotide sequences or a sequence complementary thereto, or with a nucleotide sequence selected from a sequence in (a).

The nucleic acid fragment is preferably a DNA fragment.

It is preferred that the nucleic acid fragment is longer than 10 nucleotides, such as at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, and at least 80 nucleotides long, and the sequence identity should preferably also be higher than 70%, such as at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, and at least 98%. It is most preferred that the sequence identity is 100%.

In another embodiment, the invention relates to the use of a nucleic acid fragment according to the invention for the preparation of a composition for the diagnosis of tuberculosis caused by virulent mycobacteria, e.g. by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.

The invention also relates to a replicable expression vector, which comprises a nucleic acid fragment according to the invention, and to a transformed cell harbouring at least one such vector.

In another embodiment, the invention relates to a method for producing a polypeptide according to the invention, comprising

-   (a) inserting a nucleic acid fragment according to the invention into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell, culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide, and recovering the polypeptide from the host cell or culture medium;
-   (b) isolating the polypeptide from a whole mycobacterium, e.g. Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis, from culture filtrate or from lysates or fractions thereof; or
-   (c) synthesizing the polypeptide e.g. by solid or liquid phase peptide synthesis.

In a preferred embodiment of the invention, the polypeptide fragment of the invention comprises an epitope for a T-helper cell.

The invention also relates to a method of diagnosing tuberculosis caused by virulent mycobacteria, e.g. by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis, in an animal, including a human being, comprising intradermally injecting, in the animal, a composition according to the invention, a positive skin response at the location of injection being indicative of the animal having tuberculosis, and a negative skin response at the location of injection being indicative of the animal not having tuberculosis.

A monoclonal or polyclonal antibody which reacts specifically with a polypeptide according to the invention in an immunoassay, or a specific binding fragment of said antibody, for use as a diagnostic reagent, is also a part of the invention.

In line with the disclosure above pertaining to vaccine preparation and use, the invention also pertains to a method for immunising an animal, including a human being, against TB caused by mycobacteria belonging to the tuberculosis complex, comprising administering to the animal the polypeptide of the invention, or a vaccine composition of the invention as described above, or a living vaccine described above. Preferred routes of administration are the parenteral (such as intravenous and intraarterial), intraperitoneal, intramuscular, subcutaneous, intradermal, oral, buccal, sublingual, nasal, rectal or transdermal route.

the sample from the animal with the polypeptide of the invention, a significant release into the extracellular phase of at least one cytokine by mononuclear cells in the blood sample being indicative of the animal being sensitized. By the term “significant release” is herein meant that the release of the cytokine is significantly higher than the cytokine release

polypeptide fragments of the invention and using well-known methods for visualizing the reaction between the antibody and antigen.

In a further embodiment, the invention relates to a method for diagnosing previous or ongoing infection with a virulent mycobacterium, said method comprising

-   (a) contacting a subject sample, e.g. a blood sample, with a composition according to the invention or a diagnostic tool according to the invention,
-   (b) detecting binding of an antibody, said binding being an indication that said subject is infected by Mycobacterium tuberculosis or is susceptible to Mycobacterium tuberculosis infection.

In an important embodiment, the invention relates to a serodiagnostic composition comprising a combination of two or more substantially pure polypeptides, which comprises one or more amino acid sequences selected from

-   (a) Rv0652, Rv2462c, Rv1984c, Rv2185c, Rv1636, Rv3451, Rv3872, Rv3354 and Rv2623;
-   (b) an immunogenic portion of any one of the sequences in (a); and/or
-   (c) an amino acid sequence analogue having at least 70% sequence identity to any one of the sequences in (a) or (b) and at the same time being immunogenic.

Thus, an important embodiment of the invention is a polypeptide fragment defined above which

1) induces a release of IFN-γ from primed memory T-lymphocytes withdrawn from a mouse within 2 weeks of primary infection or within 4 days after the mouse has been rechallenge infected with mycobacteria belonging to the tuberculosis complex, the induction being performed by the addition of the polypeptide to a suspension comprising about 200,000 spleen cells per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension, and/or

2) induces a release of IFN-γ of at least 1,500 pg/ml above background level from about 1,000,000 human PBMC (peripheral blood mononuclear cells) per ml isolated from TB patients in the first phase of infection, or from healthy BCG vaccinated donors, or from healthy contacts to TB patients, the induction being performed by the addition of the polypeptide to a suspension comprising the about 1,000,000 PBMC per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension; and/or

3) induces an IFN-γ release from bovine PBMC derived from animals previously sensitized with mycobacteria belonging to the tuberculosis complex, said release being at least two times the release observed from bovine PBMC derived from animals not previously sensitized with mycobacteria belonging to the tuberculosis complex.
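The two quantitative criteria above, item 2) for human PBMC and item 3) for bovine PBMC, might be checked as in the following minimal sketch. The thresholds are those stated in the text; the function and parameter names are hypothetical, and the measured IFN-γ concentrations are assumed to be supplied in pg/ml by the caller.

```python
def human_pbmc_criterion(
    stimulated_pg_ml: float,
    background_pg_ml: float,
    threshold_pg_ml: float = 1500.0,
) -> bool:
    """Item 2): IFN-gamma release of at least 1,500 pg/ml above background."""
    return stimulated_pg_ml - background_pg_ml >= threshold_pg_ml


def bovine_pbmc_criterion(
    sensitized_pg_ml: float,
    unsensitized_pg_ml: float,
    min_ratio: float = 2.0,
) -> bool:
    """Item 3): release from sensitized animals at least twice that of unsensitized."""
    return sensitized_pg_ml >= min_ratio * unsensitized_pg_ml
```

For example, a stimulated release of 2,100 pg/ml over a background of 400 pg/ml satisfies item 2), whereas 1,800 pg/ml over the same background does not.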

Apart from their use as starting points for the synthesis of polypeptides of the invention and for hybridization probes (useful for direct hybridization assays or as primers in e.g. PCR or other molecular amplification methods) the nucleic acid fragments of the invention may be used for effecting in vivo expression of antigens, i.e. the nucleic acid fragments may be used in so-called DNA vaccines. Recent research has revealed that a DNA fragment cloned in a vector which is non-replicative in eukaryotic cells may be introduced into an animal (including a human being) by e.g. intramuscular injection or percutaneous administration (the so-called “gene gun” approach). The DNA is taken up by e.g. muscle cells and the gene of interest is expressed by a promoter which is functioning in eukaryotes, e.g. a viral promoter, and the gene product thereafter stimulates the immune system. These newly discovered methods are reviewed in Ulmer et al., 1993, which is hereby incorporated by reference.

Hence, the invention also relates to a vaccine comprising a nucleic acid fragment according to the invention, the vaccine effecting in vivo expression of antigen by an animal, including a human being, to whom the vaccine has been administered, the amount of expressed antigen being effective to confer substantially increased resistance to infections with mycobacteria of the tuberculosis complex in an animal, including a human being.

Therefore, another important aspect of the present invention is an improvement of the living BCG vaccine presently available, which is a vaccine for immunizing an animal, including a human being, against TB caused by mycobacteria belonging to the tuberculosis-complex, comprising as the effective component a microorganism, wherein one or more copies of a DNA sequence encoding a polypeptide as defined above has been incorporated into the genome of the microorganism in a manner allowing the microorganism to express and secrete the polypeptide.

Throughout this specification, unless the context requires otherwise, the word “comprise”, or variations thereof such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element or integer or group of elements or integers but not the exclusion of any other element or integer or group of elements or integers.

By the term “a polypeptide” in the present application is generally understood a polypeptide of the invention, as will be described later. It is also within the meaning of “a polypeptide” that several polypeptides can be used, i.e. in the present context “a” means “at least one” unless explicitly indicated otherwise. The term “polypeptide” is used to refer to short peptides with a length of at least two amino acid residues and at most 10 amino acid residues, oligopeptides (11-100 amino acid residues), and longer peptides (the usual interpretation of “polypeptide”, i.e. more than 100 amino acid residues in length) as well as proteins (the functional entity comprising at least one peptide, oligopeptide, or polypeptide which may be chemically modified by being phosphorylated, glycosylated, by being lipidated, or by comprising prosthetic groups). The definition of polypeptides comprises native forms of peptides/proteins in Mycobacteria as well as recombinant proteins or peptides in any type of expression vectors transforming any kind of host, and also chemically synthesized polypeptides. Within the scope of the invention is a polypeptide which is at least 6 amino acids long, preferably 7, such as 8, 9, 10, 11, 12, 13, 14 amino acids long, preferably at least 15 amino acids, such as 15, 16, 17, 18, 19, 20 amino acids long. However, also longer polypeptides having a length of e.g. 25, 50, 75, 100, 125, 150, 175 or 200 amino acids are within the scope of the present invention.

In the present context the terms “purified polypeptide” and “substantially pure polypeptide fragment” mean a polypeptide preparation which contains at most 5% by weight of other polypeptide material with which it is natively associated (lower percentages of other polypeptide material are preferred, e.g. at most 4%, at most 3%, at most 2%, at most 1%, and at most ½%). It is preferred that the substantially pure polypeptide is at least 96% pure, i.e. that the polypeptide constitutes at least 96% by weight of total polypeptide material present in the preparation, and higher percentages are preferred, such as at least 97%, at least 98%, at least 99%, at least 99.25%, at least 99.5%, and at least 99.75%. It is especially preferred that the polypeptide is in “essentially pure form”, i.e. that the polypeptide is essentially free of any other antigen with which it is natively associated, i.e. free of any other antigen from bacteria belonging to the tuberculosis complex. This can be accomplished by preparing the polypeptide by means of recombinant methods in a non-mycobacterial host cell as will be described in detail below, or by synthesizing the polypeptide by the well-known methods of solid or liquid phase peptide synthesis, e.g. by the method described by Merrifield or variations thereof.

By the terms “somatic protein” or “protein derived from the cell wall, the cell membrane or the cytosol”, or by the abbreviation “SPE” is understood a polypeptide or a protein extract obtainable from a cell or a part thereof.

By an “immunologically equivalent” polypeptide is herein meant that the polypeptide, when formulated in a vaccine or a diagnostic agent (i.e. together with a pharmaceutically acceptable carrier or vehicle and optionally an adjuvant), will

I) confer, upon administration (either alone or as an immunologically active constituent together with other antigens), an acquired increased specific resistance in a mouse and/or in a guinea pig and/or in a primate such as a human being against infections with bacteria belonging to the tuberculosis complex which is at least 20% of the acquired increased resistance conferred by Mycobacterium bovis BCG and also at least 20% of the acquired increased resistance conferred by the parent polypeptide comprising SEQ ID NO: 2.

By the terms “culture filtrate protein”, or by the abbreviation “STCF” is understood a complex mixture of proteins released from M. tuberculosis during the first few days of growth in a liquid medium.

By the term “non-naturally occurring polypeptide” is understood a polypeptide that does not occur naturally. This means that the polypeptide is substantially pure, and/or that the polypeptide has been synthesized in the laboratory, and/or that the polypeptide has been produced by means of recombinant technology.

The “tuberculosis-complex” has its usual meaning, i.e. the complex of mycobacteria causing TB which are Mycobacterium tuberculosis, Mycobacterium bovis, Mycobacterium bovis BCG, and Mycobacterium africanum.

By the term “virulent Mycobacterium” is understood a bacterium capable of causing the tuberculosis disease in a mammal including a human being. Examples of virulent Mycobacteria are M. tuberculosis, M. africanum, and M. bovis.

By “a TB patient” is understood an individual with culture or microscopically proven infection with virulent Mycobacteria, and/or an individual clinically diagnosed with TB and who is responsive to anti-TB chemotherapy. Culture, microscopy and clinical diagnosis of TB are well known to the person skilled in the art.

By the term “PPD positive individual” is understood an individual with a positive Mantoux test or an individual where PPD induces an increase in in vitro recall response determined by release of IFN-γ of at least 1,000 pg/ml from Peripheral Blood Mononuclear Cells (PBMC) or whole blood, the induction being performed by the addition of 2.5 to 5 μg PPD/ml to a suspension comprising about 1.0 to 2.5×10⁵ PBMC, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 5 days after the addition of PPD to the suspension compared to the release of IFN-γ without the addition of PPD.

By the term “delayed type hypersensitivity reaction” is understood a T-cell mediated inflammatory response elicited after the injection of a polypeptide into or application to the skin, said inflammatory response appearing 72-96 hours after the polypeptide injection or application.

By the term “IFN-γ” is understood interferon-gamma.

By the terms “analogue” and “subsequence” when used in connection with polypeptides is meant any polypeptide having the same immunological characteristics as the polypeptides of the invention shown in any of SEQ ID NOs: 8, 30, 34, 38, 149, 64, 10 or 88. Thus, included is also a polypeptide from a different source, such as from another bacterium or even from a eukaryotic cell.

By an “immunologically equivalent” polypeptide is herein meant that the polypeptide, when formulated in a diagnostic agent (i.e. together with a pharmaceutically acceptable carrier or vehicle and optionally an adjuvant), will elicit a diagnostically significant immune response in a mammal indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex; this diagnostically significant immune response can be in the form of a delayed type hypersensitivity reaction which can e.g. be determined by a skin test, or a serological test. A diagnostically significant response in a skin test setup will be a reaction which gives rise to a skin reaction which is at least 5 mm in diameter and which is at least 65% (preferably at least 75% such as at least 85%) of the skin reaction (assessed as the skin reaction diameter) elicited by the parent polypeptide comprising SEQ ID NO: 8, 30, 34, 38, 149, 64, 10 or 88.
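The skin test criterion just defined combines an absolute and a relative threshold. A minimal sketch of that check follows; the function and parameter names are hypothetical, and both diameters are assumed to be measured in millimetres under the same conditions.

```python
def skin_test_significant(
    diameter_mm: float,
    parent_diameter_mm: float,
    min_fraction: float = 0.65,
) -> bool:
    """True if the reaction is at least 5 mm and at least min_fraction of the
    reaction elicited by the parent polypeptide.

    min_fraction defaults to the stated 65%; 0.75 or 0.85 give the preferred limits.
    """
    return diameter_mm >= 5.0 and diameter_mm >= min_fraction * parent_diameter_mm
```

For instance, an 8 mm reaction against a 10 mm parent reaction meets the criterion, while a 6 mm reaction against the same parent reaction does not, because it is below 65% of 10 mm.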

When the term “nucleotide” is used in the following, it should be understood in the broadest sense. That is, most often the nucleotide should be considered as DNA. However, when DNA can be substituted with RNA, the term nucleotide should be read to include RNA embodiments which will be apparent to the person skilled in the art. For the purposes of hybridization, PNA or LNA may be used instead of DNA. PNA has been shown to exhibit a very dynamic hybridization profile and is described in Nielsen P E et al., 1991, Science 254: 1497-1500. LNA (Locked Nucleic Acids) is a recently introduced oligonucleotide analogue containing bicyclo nucleoside monomers (Koshkin et al., 1998, 54, 3607-3630; Nielsen, N. K. et al. J. Am. Chem. Soc 1998, 120, 5458-5463).

The term “stringent” when used in conjunction with hybridization conditions is as defined in the art, i.e. the hybridization is performed at a temperature not more than 15-20° C. under the melting point Tm, cf. Sambrook et al, 1989, pages 11.45-11.49. Preferably, the conditions are “highly stringent”, i.e. 5-10° C. under the melting point Tm.

The terms “analogue” or “subsequence” when used in connection with the nucleotide fragments of the invention are thus intended to indicate a nucleotide sequence which encodes a polypeptide exhibiting identical or substantially identical immunological properties to a polypeptide encoded by the nucleotide fragment of the invention shown in any of SEQ ID NOs: 7, 29, 33, 37, 148, 63, 9 or 87 allowing for minor variations which do not have an adverse effect on the ligand binding properties and/or biological function and/or immunogenicity as compared to any of the polypeptides of the invention or which give interesting and useful novel binding properties or biological functions and immunogenicities etc. of the analogue and/or subsequence. The analogous nucleotide fragment or nucleotide sequence may be derived from a bacterium, a mammal, or a human or may be partially or completely of synthetic origin. The analogue and/or subsequence may also be derived through the use of recombinant nucleotide techniques.

The term “subsequence” when used in connection with the nucleic acid fragments of the invention is intended to indicate a continuous stretch of at least 10 nucleotides which exhibits the above hybridization pattern. Normally this will require a minimum sequence identity of at least 70% with a subsequence of the hybridization partner having SEQ ID NO: 7, 29, 33, 37, 148, 63, 9 or 87. It is preferred that the nucleic acid fragment is longer than 10 nucleotides, such as at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, or at least 80 nucleotides long, and the sequence identity should preferably also be higher than 70%, such as at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, or at least 98%. It is most preferred that the sequence identity is 100%. Such fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, by application of nucleic acid reproduction technology, such as the PCR technology of U.S. Pat. No. 4,603,102, or by introducing selected sequences into recombinant vectors for recombinant production.
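As a rough illustration of the length and identity thresholds above, the following sketch slides a candidate fragment along a reference sequence without gaps and reports the best percentage of matching positions. This is only a computational proxy under stated assumptions: it does not model hybridization itself, the function names are hypothetical, and the reference sequence in the example is invented for illustration and is not one of the SEQ ID NOs.

```python
def best_window_identity(fragment, reference):
    """Best ungapped percent identity of `fragment` against any
    equally long window of `reference`."""
    n = len(fragment)
    if n == 0 or n > len(reference):
        raise ValueError("fragment must be non-empty and no longer than reference")
    best = 0.0
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        matches = sum(a == b for a, b in zip(fragment, window))
        best = max(best, 100.0 * matches / n)
    return best


def meets_subsequence_criteria(fragment, reference, min_length=10, min_identity=70.0):
    """Apply the minimum length (10 nt) and minimum identity (70%) from the text."""
    return (len(fragment) >= min_length
            and best_window_identity(fragment, reference) >= min_identity)


# Invented reference sequence, for illustration only.
reference = "ATGGCTAAAGGTCCTGACGT"
print(meets_subsequence_criteria("GCTAAAGGTC", reference))  # exact 10-mer
print(meets_subsequence_criteria("GCTAAA", reference))      # too short
```

Raising `min_length` and `min_identity` reproduces the preferred, stricter embodiments listed above.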

Furthermore, the terms “analogue” and “subsequence” are intended to allow for variations in the sequence such as substitution, insertion (including introns), addition, deletion and rearrangement of one or more nucleotides, which variations do not have any substantial effect on the polypeptide encoded by a nucleotide fragment or a subsequence thereof. The term “substitution” is intended to mean the replacement of one or more nucleotides in the full nucleotide sequence with one or more different nucleotides, “addition” is understood to mean the addition of one or more nucleotides at either end of the full nucleotide sequence, “insertion” is intended to mean the introduction of one or more nucleotides within the full nucleotide sequence, “deletion” is intended to indicate that one or more nucleotides have been deleted from the full nucleotide sequence whether at either end of the sequence or at any suitable point within it, and “rearrangement” is intended to mean that two or more nucleotide residues have been exchanged with each other.

It is well known that the same amino acid may be encoded by various codons, the codon usage being related, inter alia, to the preference of the organisms in question expressing the nucleotide sequence. Thus, at least one nucleotide or codon of a nucleotide fragment of the invention may be exchanged by others which, when expressed, results in a polypeptide identical or substantially identical to the polypeptide encoded by the nucleotide fragment in question.
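The codon degeneracy described above can be shown with a short sketch that translates two different nucleotide sequences into the same peptide. The sketch assumes the standard genetic code; the two example sequences are invented for illustration, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# for the first, second and third codon positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two invented sequences that differ in the second and third codons
# (GCT vs GCG, AAA vs AAG) but encode the same peptide.
print(translate("ATGGCTAAA"))  # MAK
print(translate("ATGGCGAAG"))  # MAK
```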

The term “sequence identity” indicates a quantitative measure of the degree of homology between two amino acid sequences of equal length or between two nucleotide sequences of equal length. If the two sequences to be compared are not of equal length, they must be aligned to best possible fit. The sequence identity can be calculated as

$\frac{\left( {N_{ref} - N_{dif}} \right)100}{N_{ref}},$

wherein $N_{dif}$ is the total number of non-identical residues in the two sequences when aligned and wherein $N_{ref}$ is the number of residues in one of the sequences. Hence, the DNA sequence AGTCAGTC will have a sequence identity of 75% with the sequence AATCAATC ($N_{dif}=2$ and $N_{ref}=8$). A gap is counted as non-identity of the specific residue(s), i.e. the DNA sequence AGTGTC will have a sequence identity of 75% with the DNA sequence AGTCAGTC ($N_{dif}=2$ and $N_{ref}=8$). Sequence identity can alternatively be calculated by the BLAST program e.g. the BLASTP program or the BLASTN program (Pearson W. R and D. J. Lipman (1988) PNAS USA 85:2444-2448) (www.ncbi.nlm.nih.gov/BLAST). In one aspect of the invention, alignment is performed with the global align algorithm with default parameters as described by X. Huang and W. Miller. Adv. Appl. Math. (1991) 12:337-357, available at http://www.ch.embnet.org/software/LALIGN_form.html.
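The identity formula can be sketched as follows for two sequences that have already been aligned to equal length, with "-" marking a gap. The alignment step itself (e.g. by BLAST or a global alignment program) is not reproduced here; the function name and the gap symbol are assumptions made for illustration.

```python
def sequence_identity(ref_aligned, other_aligned, gap="-"):
    """Percent identity, (N_ref - N_dif) * 100 / N_ref, for two pre-aligned
    sequences of equal length. Gaps count as non-identical positions."""
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned sequences must have equal length")
    # N_ref: number of residues (non-gap characters) in the reference sequence.
    n_ref = sum(1 for r in ref_aligned if r != gap)
    if n_ref == 0:
        raise ValueError("reference sequence contains no residues")
    # N_dif: number of aligned positions that are not identical.
    n_dif = sum(1 for r, o in zip(ref_aligned, other_aligned) if r != o)
    return (n_ref - n_dif) * 100 / n_ref


# The two worked examples from the text.
print(sequence_identity("AGTCAGTC", "AATCAATC"))  # 75.0
print(sequence_identity("AGTCAGTC", "AGT--GTC"))  # 75.0
```

In the second call, AGTGTC is written with two gap characters so that it aligns against the eight residues of AGTCAGTC.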

The sequence identity is used here to illustrate the degree of identity between the amino acid sequence of a given polypeptide and the amino acid sequence shown in SEQ ID NO: 8, 30, 34, 38, 149, 64, 10 and 88. The amino acid sequence to be compared with the amino acid sequence shown in SEQ ID NO: 8, 30, 34, 38, 149, 64, 10 and 88 may be deduced from a DNA sequence, e.g. obtained by hybridization as defined below, or may be obtained by conventional amino acid sequencing methods. The sequence identity is preferably determined on the amino acid sequence of a mature polypeptide, i.e. without taking any leader sequence into consideration.

As appears from the above disclosure, polypeptides which are not identical to the polypeptides having SEQ ID NO: 8, 30, 34, 38, 149, 64, 10 and 88 are embraced by the present invention. The invention allows for minor variations which do not have an adverse effect on immunogenicity compared to the parent sequences and which may give interesting and useful novel binding properties or biological functions and immunogenicities etc.

Each polypeptide fragment may thus be characterized by specific amino acid and nucleic acid sequences. It will be understood that such sequences include analogues and variants produced by recombinant methods wherein such nucleic acid and polypeptide sequences have been modified by substitution, insertion, addition and/or deletion of one or more nucleotides in said nucleic acid sequences to cause the substitution, insertion, addition or deletion of one or more amino acid residues in the recombinant polypeptide. When the term DNA is used in the following, it should be understood that for the number of purposes where DNA can be substituted with RNA, the term DNA should be read to include RNA embodiments which will be apparent for the man skilled in the art. For the purposes of hybridization, PNA may be used instead of DNA, as PNA has been shown to exhibit a very dynamic hybridization profile (PNA is described in Nielsen P E et al., 1991, Science 254: 1497-1500).

The nucleotide sequence to be modified may be of cDNA or genomic origin as discussed above, but may also be of synthetic origin. Furthermore, the sequence may be of mixed cDNA and genomic, mixed cDNA and synthetic or genomic and synthetic origin as discussed above. The sequence may have been modified, e.g. by site-directed mutagenesis, to result in the desired nucleic acid fragment encoding the desired polypeptide.

The nucleotide sequence may be modified using any suitable technique which results in the production of a nucleic acid fragment encoding a polypeptide of the invention.

The modification of the nucleotide sequence encoding the amino acid sequence of the polypeptide of the invention should be one which does not impair the immunological function of the resulting polypeptide.

In particular, the invention relates to a polypeptide obtained from M. tuberculosis, which polypeptide has at least one of the following properties:

i) it induces a specific antibody response in a TB patient as determined by an ELISA technique or a western blot when the whole blood is diluted 1:20 in PBS and stimulated with the polypeptide in a concentration of at most 20 μg/ml, and induces an OD of at least 0.1 in ELISA or a visual response in western blot;

ii) it induces a positive DTH response determined by intradermal injection or local application patch of at most 100 μg of the polypeptide to an individual who is clinically or subclinically infected with a virulent Mycobacterium, a positive response having a diameter of at least 10 mm 72-96 hours after the injection or application;

iii) it induces a positive DTH response determined by intradermal injection or local application patch of at most 100 μg of the polypeptide to an individual who is clinically or subclinically infected with a virulent Mycobacterium, a positive response having a diameter of at least 10 mm 72-96 hours after the injection, and preferably does not induce such a response in an individual who has cleared an infection with a virulent Mycobacterium.

Any polypeptide fulfilling one or more of the above properties and which is obtainable from either the cell wall, cell membrane, the cytosol or STCF is within the scope of the present invention.

The property described in i) will in particular be satisfied if the ELISA is performed as follows: the polypeptide of interest, in a concentration of 1 to 10 μg/ml, is coated on a 96-well polystyrene plate (NUNC, Denmark) and, after a washing step with phosphate buffer pH 7.3 containing 0.37 M NaCl and 0.5% Tween-20, the serum or plasma from a TB patient is applied in dilutions from 1:10 to 1:1000 in PBS with 1% Tween-20. Binding of an antibody to the polypeptide is determined by addition of a labeled (e.g. peroxidase labeled) secondary antibody, and the reaction is thereafter visualized by the use of OPD and H₂O₂ as described by the manufacturer (DAKO, Denmark). The OD value in each well is determined using an appropriate ELISA reader.

In a preferred embodiment the western blot is performed as follows: The polypeptide is applied in amounts from 1-40 μg to an SDS-PAGE gel and after electrophoresis the polypeptide is transferred to a membrane, e.g. nitrocellulose or PVDF. The membrane is thereafter washed in phosphate buffer, pH 7.3, containing 0.37 M NaCl and 0.5% Tween-20 for 30 min. The sera obtained from one or more TB patients are diluted 1:10 to 1:1000 in phosphate buffer pH 7.3 containing 0.37 M NaCl. The membrane is thereafter washed four times for five minutes in binding buffer and incubated with peroxidase- or phosphatase-labeled secondary antibody. The reaction is then visualized using the staining method recommended by the manufacturer (DAKO, Denmark).

The property described in ii) will in particular be satisfied if the polypeptide does not induce such a response in an individual not infected with a virulent Mycobacterium, i.e. an individual who has been BCG vaccinated or infected with Mycobacterium avium or sensitized by non-tuberculosis Mycobacterium. In a preferred embodiment the amount of polypeptide intradermally injected or applied is 90 μg, such as 80 μg, 70 μg, 60 μg, 50 μg, 40 μg, or 30 μg. In another embodiment of the invention, the diameter of the positive response is at least 11 mm, such as 12 mm, 13 mm, 14 mm, or 15 mm. In a preferred embodiment the induration or erythema, or both, can be determined after administration of the polypeptide by intradermal injection, patch test or multipuncture. The reaction diameter can be positive after more than 48 hours, such as 72 or 96 hours.

The property described in iii) will in particular be satisfied if the polypeptide does not induce such a response in an individual cleared of an infection with a virulent Mycobacterium, i.e. an individual who does not have any positive culture or microscopically proven ongoing infection with virulent Mycobacterium. The comments on property ii) regarding the amount of polypeptide intradermally injected or applied and the diameter of the positive response are equally relevant to property iii).

In immunodiagnostics, it is often possible and practical to prepare antigens from segments of a known immunogenic protein or polypeptide. Certain epitopic regions may be used to produce responses similar to those produced by the entire antigenic polypeptide. Potential antigenic or immunogenic regions may be identified by any of a number of approaches, e.g., Jameson-Wolf or Kyte-Doolittle antigenicity analyses or Hopp and Woods (Hopp and Woods, (1981), Proc Natl Acad Sci USA 78/6:3824-8) hydrophobicity analysis (see, e.g., Jameson and Wolf, (1988) Comput Appl Biosci, 4(1):181-6; Kyte and Doolittle, (1982) J Mol Biol, 157(1):105-32; or U.S. Pat. No. 4,554,101). Hydrophobicity analysis assigns average hydrophilicity values to each amino acid residue; from these values average hydrophilicities can be calculated and regions of greatest hydrophilicity determined. Using one or more of these methods, regions of predicted antigenicity may be derived from the amino acid sequence assigned to the polypeptides of the invention. Alternatively, in order to identify relevant T-cell epitopes which are recognized during an immune response, it is also possible to use a “brute force” method: since T-cell epitopes are linear, deletion mutants of polypeptides will, if constructed systematically, reveal which regions of the polypeptide are essential in immune recognition. A presently preferred method utilises overlapping oligomers (preferably synthetic ones having a length of e.g. 20 amino acid residues) derived from the polypeptide. A preferred T-cell epitope is a T-helper cell epitope or a cytotoxic T-cell epitope.
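The overlapping oligomer approach can be sketched as a simple tiling of the amino acid sequence. The 20-residue length comes from the text; the step of 10 residues between peptide starts (i.e. a 10-residue overlap) is an assumption made for illustration, as is the function name, and the example sequence is invented.

```python
def overlapping_peptides(sequence, length=20, step=10):
    """Return (start_position, peptide) pairs tiling `sequence`.

    Positions are 1-based. A final peptide is added when needed so that
    the C-terminal residues are always covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [(1, sequence)]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # extra peptide covering the C-terminus
    return [(s + 1, sequence[s:s + length]) for s in starts]


# Invented 45-residue sequence, for illustration only.
example = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACDEF"
for start, peptide in overlapping_peptides(example):
    print(start, peptide)
```

Each peptide can then be tested individually, e.g. in a T-cell recall assay or as the coating layer in an ELISA, to locate the regions that are recognized.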

Although the minimum length of a T-cell epitope has been shown to be at least 6 amino acids, it is normal that such epitopes are constituted of longer stretches of amino acids. Hence it is preferred that the polypeptide fragment of the invention has a length of at least 7 amino acid residues, such as at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, or at least 30 amino acid residues.

B-cell epitopes may be linear or spatial. The three-dimensional structure of a protein is often such that amino acids, which are located distant from each other in the one-dimensional structure, are located near to each other in the folded protein. Within the meaning of the present context, the expression epitope is intended to comprise the one- and three-dimensional structure as well as mimics thereof. The term is further intended to include discontinuous B-cell epitopes. The linear B-cell epitopes can be identified in a similar manner as described for the T-cell epitopes above. However, when identifying B-cell epitopes the assay should be an ELISA using overlapping oligomers derived from the polypeptide as the coating layer on a microtiter plate as described elsewhere.

A non-naturally occurring polypeptide, an analogue, a subsequence, a T-cell epitope and/or a B-cell epitope of any of the described polypeptides are defined as any non-naturally occurring polypeptide, analogue, subsequence, T-cell epitope and/or B-cell epitope of any of the polypeptides inducing a specific antibody response in a TB patient.

Preferred embodiments of the invention are the specific polypeptides which have been identified and analogues and subsequences thereof. It has been noted that none of the identified polypeptides in the examples include a signal sequence. Table 1 lists the antigens of the invention.

TABLE 1 The antigens of the invention by the names used herein as well as by reference to relevant SEQ ID NOs of N-terminal sequences, full amino acid sequences and sequences of nucleotides encoding the antigens

| Antigen | Sanger ID | Nucleotide sequence SEQ ID NO: | Amino acid sequence SEQ ID NO: |
|---|---|---|---|
| TB15A | Rv1636 | 7 | 8 |
| TB16 | Rv2185c | 29 | 30 |
| TB32 | Rv2623 | 33 | 34 |
| TB51 | Rv2462c | 37 | 38 |
| CFP8A | Rv3354 | 148 | 149 |
| CFP16 | Rv0652 | 63 | 64 |
| CFP21 | Rv1984c | 9 | 10 |
| CFP23 | Rv3451 | 55 | 56 |
| RD1-ORF3 | Rv3872 | 87 | 88 |

Until the present invention was made, it was unknown that the polypeptides Rv1636, Rv2185c, Rv2623, Rv2462c, Rv3354, Rv0652, Rv1984c, Rv3451 or Rv3872 with the amino acid sequences disclosed in SEQ ID NOs: 8, 30, 34, 38, 149, 64, 10, 56 and 88 are expressed in live virulent Mycobacterium. These polypeptides in purified form, or non-naturally occurring, i.e. recombinantly or synthetically produced, are considered part of the invention. It is understood that a polypeptide which has the above mentioned property and has a sequence identity of at least 80% with any of the amino acid sequences shown in SEQ ID NOs: 8, 30, 34, 38, 149, 64, 10, 56 and 88 or has a sequence identity of at least 80% to any subsequence thereof is considered part of the invention. In a preferred embodiment the sequence identity is at least 80%, such as 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5%. Furthermore, any T-cell epitope of the polypeptides disclosed in SEQ ID NOs: 8, 30, 34, 38, 149, 64, 10, 56 and 88 is considered part of the invention. Also, any B-cell epitope of the polypeptides disclosed in SEQ ID NOs: 8, 30, 34, 38, 149, 64, 10, 56 and 88 is considered part of the invention.

The invention also relates to a replicable expression vector which comprises a nucleic acid fragment defined above, especially a vector which comprises a nucleic acid fragment encoding a polypeptide fragment of the invention. The vector may be any vector which may conveniently be subjected to recombinant DNA procedures, and the choice of vector will often depend on the host cell into which it is to be introduced. Thus, the vector may be an autonomously replicating vector, i.e. a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication; examples of such a vector are a plasmid, phage, cosmid, mini-chromosome and virus. Alternatively, the vector may be one which, when introduced in a host cell, is integrated in the host cell genome and replicated together with the chromosome(s) into which it has been integrated.

Expression vectors may be constructed to include any of the DNA segments disclosed herein. Such DNA might encode an antigenic protein specific for virulent strains of mycobacteria or even hybridization probes for detecting mycobacteria nucleic acids in samples. Longer or shorter DNA segments could be used, depending on the antigenic protein desired. Epitopic regions of the proteins expressed or encoded by the disclosed DNA could be included as relatively short segments of DNA. A wide variety of expression vectors is possible including, for example, DNA segments encoding reporter gene products useful for identification of heterologous gene products and/or resistance genes such as antibiotic resistance genes which may be useful in identifying transformed cells.

The vector of the invention may be used to transform cells so as to allow propagation of the nucleic acid fragments of the invention or so as to allow expression of the polypeptide fragments of the invention. Hence, the invention also pertains to a transformed cell harbouring at least one such vector according to the invention, said cell being one which does not natively harbour the vector and/or the nucleic acid fragment of the invention contained therein. Such a transformed cell (which is also a part of the invention) may be any suitable bacterial host cell or any other type of cell such as a unicellular eukaryotic organism, a fungus or yeast, or a cell derived from a multicellular organism, e.g. an animal or a plant. It is especially in cases where glycosylation is desired that a mammalian cell is used, since glycosylation of proteins is a rare event in prokaryotes. Normally, however, a prokaryotic cell is preferred, such as a bacterium belonging to the genera Mycobacterium, Salmonella, Pseudomonas, Bacillus and Escherichia. It is preferred that the transformed cell is an E. coli, B. subtilis, or M. bovis BCG cell, and it is especially preferred that the transformed cell expresses a polypeptide according to the invention. The latter opens the possibility of producing the polypeptide of the invention by simply recovering it from the culture containing the transformed cell. In the most preferred embodiment of this part of the invention the transformed cell is Mycobacterium bovis BCG strain: Danish 1331, which is the Mycobacterium bovis strain Copenhagen from the Copenhagen BCG Laboratory, Statens Seruminstitut, Denmark.

The nucleic acid fragments of the invention allow for the recombinant production of the polypeptide fragments of the invention. However, isolation from the natural source is also a way of providing the polypeptide fragments, as is peptide synthesis.

Therefore, the invention also pertains to a method for the preparation of a polypeptide fragment of the invention, said method comprising inserting a nucleic acid fragment as described in the present application into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell (transformed cells may be selected using various techniques, including screening by differential hybridization, identification of fused reporter gene products, resistance markers, anti-antigen antibodies and the like), culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide (of course the cell may be cultivated under conditions appropriate to the circumstances, and if DNA is desired, replication conditions are used), and recovering the polypeptide from the host cell or culture medium; or

-   isolating the polypeptide from a short-term culture filtrate; or
-   isolating the polypeptide from whole mycobacteria of the tuberculosis complex or from lysates or fractions thereof, e.g. cell wall containing fractions; or
-   synthesizing the polypeptide by solid or liquid phase peptide synthesis.

The medium used to grow the transformed cells may be any conventional medium suitable for the purpose. A suitable vector may be any of the vectors described above, and an appropriate host cell may be any of the cell types listed above. The methods employed to construct the vector and effect introduction thereof into the host cell may be any methods known for such purposes within the field of recombinant DNA. In the following a more detailed description of the possibilities will be given:

In general, of course, prokaryotes are preferred for the initial cloning of nucleic acid sequences of the invention and for constructing the vectors useful in the invention. For example, in addition to the particular strains mentioned in the more specific disclosure below, one may mention strains such as E. coli K12 strain 294 (ATCC No. 31446), E. coli B, and E. coli X 1776 (ATCC No. 31537). These examples are, of course, intended to be illustrative and not limiting.

Prokaryotes are also preferred for expression. The aforementioned strains, as well as E. coli W3110 (F-, lambda-, prototrophic, ATCC No. 273325), bacilli such as Bacillus subtilis, or other Enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species may be used. Especially interesting are rapid-growing mycobacteria, e.g. M. smegmatis, as these bacteria have a high degree of resemblance with mycobacteria of the tuberculosis complex and therefore stand a good chance of reducing the need of performing post-translational modifications of the expression product.

In general, plasmid vectors containing replicon and control sequences which are derived from species compatible with the host cell are used in connection with these hosts. The vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells. For example, E. coli is typically transformed using pBR322, a plasmid derived from an E. coli species (see, e.g., Bolivar et al., 1977, Gene 2: 95). The pBR322 plasmid contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells. The pBR plasmid, or other microbial plasmids or phages must also contain, or be modified to contain, promoters which can be used by the microorganism for expression.

Those promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., (1978), Nature, 35:515; Itakura et al., (1977), Science 198:1056; Goeddel et al., (1979), Nature 281:544) and a tryptophan (trp) promoter system (Goeddel et al., (1979) Nature 281:544; EPO Appl. Publ. No. 0036776). While these are the most commonly used, other microbial promoters have been discovered and utilized, and details concerning their nucleotide sequences have been published, enabling a skilled worker to ligate them functionally with plasmid vectors (Siebenlist et al., (1980), Cell, 20:269). Certain genes from prokaryotes may be expressed efficiently in E. coli from their own promoter sequences, precluding the need for addition of another promoter by artificial means.

After the recombinant preparation of the polypeptide according to the invention, the isolation of the polypeptide may for instance be carried out by affinity chromatography (or other conventional biochemical procedures based on chromatography), using a monoclonal antibody which substantially specifically binds the polypeptide according to the invention. Another possibility is to employ the simultaneous electroelution technique described by Andersen et al. in J. Immunol. Methods 161: 29-39.

According to the invention the post-translational modifications involve lipidation, glycosylation, cleavage, or elongation of the polypeptide.

Individuals infected with virulent Mycobacteria can generally be divided into two groups. The first group has an infection with a virulent Mycobacterium e.g. contacts of TB patients. The virulent Mycobacterium may have established colonies in the lungs, but the individual has, as yet, no symptoms of TB. The second group has clinical symptoms of TB, as a TB patient.

In one embodiment of the invention, any of the above mentioned polypeptides are used for the manufacture of a diagnostic reagent that preferably distinguishes a subclinically or clinically infected individual (group I and group II) from an individual who has been BCG vaccinated or infected with Mycobacterium avium or sensitized by non-tuberculosis Mycobacterium (NTM), and may distinguish a subclinically or clinically infected individual from an individual who has cleared a previous infection with a virulent Mycobacterium. It is most likely that specific polypeptides derived from SPE or ST-CF will identify group I and/or group II from individuals not infected with virulent Mycobacteria in the same way as ESAT-6 and CFP10 (P. Ravn et al., (1998), J. Infectious Disease 179:637-45).

In another embodiment of the invention, any of the above discussed polypeptides are used for the manufacture of a diagnostic reagent for the diagnosis of an infection with a virulent Mycobacterium. One embodiment of the invention provides a diagnostic reagent for differentiating an individual who is clinically or subclinically infected with a virulent Mycobacterium from an individual not infected with virulent Mycobacterium, i.e. an individual who has been BCG vaccinated or infected with Mycobacterium avium or sensitized by non-tuberculosis Mycobacterium (NTM). Such a diagnostic reagent will distinguish between an individual in group I and/or II of the infection stages above, from an individual who has been vaccinated against TB.

Another embodiment of the invention provides a diagnostic reagent for differentiating an individual who is clinically or subclinically infected with a virulent Mycobacterium from an individual who has a cleared infection with a virulent Mycobacterium. Such a diagnostic reagent will distinguish between an individual in group I and/or II of the infection stages above, from an individual who has cleared the infection.

Determination of an infection with virulent Mycobacterium will be instrumental in the, still very laborious, diagnostic process of tuberculosis. A number of possible diagnostic assays and methods can be envisaged (some more specifically described in the examples and the list of properties): a sample comprising whole blood or mononuclear cells (i.a. T-lymphocytes) from a patient could be contacted with a sample of one or more polypeptides of the invention. This contacting can be performed in vitro and a positive reaction could e.g. be proliferation of the T-cells or release of cytokines such as IFN-γ into the extracellular phase (e.g. into a culture supernatant).

Alternatively, a sample of a possibly infected organ may be contacted with an antibody raised against a polypeptide of the invention. The demonstration of the reaction by means of methods well-known in the art between the sample and the antibody will be indicative of ongoing infection and could be used to monitor treatment effect by reduction in responses.

It is of course also a possibility to demonstrate the presence of anti-Mycobacterial antibodies in serum by contacting a serum sample from a subject with at least one of the polypeptide fragments of the invention and using well-known methods for visualising the reaction between the antibody and antigen such as ELISA, Western blot, precipitation assays.

The invention also relates to a method of diagnosing infection caused by a virulent Mycobacterium in a mammal, including a human being, comprising locally applying (patch test) or intradermally injecting (Mantoux test) a polypeptide of the invention. Both tests rely on a delayed type hypersensitivity (DTH) reaction. A positive skin response at the location of injection or application is indicative of the mammal, including a human being, being infected with a virulent Mycobacterium, and a negative skin response at the location of injection or application is indicative of the mammal, including a human being, not having TB. A positive response is a skin reaction having a diameter of at least 5 mm larger than background, but larger reactions are preferred, such as at least 1 cm, at least 1.5 cm, and at least 2 cm in diameter. A skin reaction is here to mean erythema or induration of the skin, as directly measured. The composition used as the skin test reagent can be prepared in the same manner as described for the vaccines above.

In human volunteers, the generation of a significant immune response can alternatively be defined as the ability of the reagent being tested to stimulate an in vitro recall response by peripheral blood cells from at least 30% of PPD positive individuals previously vaccinated with that reagent or infected with a virulent Mycobacterium, said recall response being defined as proliferation of T cells or the production of cytokine(s) which is higher than the responses generated by cells from unimmunized or uninfected control individuals, with a 95% confidence interval as defined by an appropriate statistical analysis such as a Student's two-tailed T test.

The polypeptides according to the invention may be potential drug targets. The fact that certain of the disclosed antigens are not present in M. bovis BCG but are present in virulent mycobacteria points them out as interesting drug targets; the antigens may constitute receptor molecules or toxins which facilitate the infection by the mycobacterium, and if such functionalities are blocked the infectivity of the mycobacterium will be diminished.

To determine particularly suitable drug targets among the antigens of the invention, the gene encoding at least one of the polypeptides of the invention and the necessary control sequences can be introduced into avirulent strains of mycobacteria (e.g. BCG) so as to determine which of the polypeptides are critical for virulence. Once particular proteins are identified as critical for/contributory to virulence, anti-mycobacterial agents can be designed rationally to inhibit expression of the critical genes or to attack the critical gene products. For instance, antibodies or fragments thereof (such as Fab and (Fab′)₂ fragments) can be prepared against such critical polypeptides by methods known in the art and thereafter used as prophylactic or therapeutic agents. Alternatively, small molecules can be screened for their ability to selectively inhibit expression of the critical gene products, e.g. using recombinant expression systems which include the gene's endogenous promoter, or for their ability to directly interfere with the action of the target. These small molecules are then used as therapeutics or as prophylactic agents to inhibit mycobacterial virulence.

Alternatively, anti-mycobacterial agents which render a virulent mycobacterium avirulent can be operably linked to expression control sequences and used to transform a virulent mycobacterium. Such anti-mycobacterial agents inhibit the replication of a specified mycobacterium upon transcription or translation of the agent in the mycobacterium. Such a “newly avirulent” mycobacterium would constitute a superb alternative to the above described modified BCG for vaccine purposes since it would be immunologically very similar to a virulent mycobacterium compared to e.g. BCG.

Once a particularly interesting polypeptide has been identified, the biological function of that polypeptide may be tested. The polypeptides may constitute receptor molecules or toxins which facilitate the infection by the Mycobacterium, and if such functionality is blocked, the infectivity of the virulent Mycobacterium will be diminished.

The biological function of particularly interesting polypeptides may be tested by studying the effect of inhibiting the expression of the polypeptides on the virulence of the virulent Mycobacterium. This inhibition may be performed at the gene level, such as by blocking the expression using antisense nucleic acid, PNA or LNA or by interfering with regulatory sequences, or the inhibition may be at the level of translation or post-translational processing of the polypeptide.

Once a particular polypeptide according to the invention is identified as critical for virulence, an anti-mycobacterial agent might be designed to inhibit the expression of that polypeptide. Such an anti-mycobacterial agent might be used as a prophylactic or therapeutic agent. For instance, antibodies or fragments thereof, such as Fab and (Fab′)₂ fragments, can be prepared against such critical polypeptides by methods known in the art and thereafter used as prophylactic or therapeutic agents.

A monoclonal or polyclonal antibody which reacts specifically with a polypeptide of the invention in an immunoassay, or a specific binding fragment of said antibody, is also a part of the invention. The production of such polyclonal antibodies requires that a suitable animal be immunized with the polypeptide and that these antibodies are subsequently isolated, suitably by immune affinity chromatography. The production of monoclonals can be effected by methods well-known in the art, since the present invention provides for adequate amounts of antigen for both immunization and screening of positive hybridomas.

As will appear from the examples, a number of the polypeptides of the invention are natively translation products which include a leader sequence (or other short peptide sequences), whereas the products which can be isolated from short-term culture filtrates from bacteria belonging to the tuberculosis complex are free of these sequences. Although it may in some applications be advantageous to produce these polypeptides recombinantly and in this connection facilitate export of the polypeptides from the host cell by including information encoding the leader sequence in the gene for the polypeptide, it is more often preferred to either substitute the leader sequence with one which has been shown to be superior in the host system for effecting export, or to totally omit the leader sequence (e.g. when producing the polypeptide by peptide synthesis). Hence, a preferred embodiment of the invention is a polypeptide which is free from amino acid residues −32 to −1 in SEQ ID NO: 10 and/or −33 to −1 in SEQ ID NO: 56.

In another preferred embodiment, the polypeptide fragment of the invention is free from any signal sequence; this is especially interesting when the polypeptide fragment is produced synthetically, but even when the polypeptide fragments are produced recombinantly it is normally acceptable that they are not exported by the host cell to the periplasm or the extracellular space; the polypeptide fragments can be recovered by traditional methods (cf. the discussion below) from the cytoplasm after disruption of the host cells, and if there is need for refolding of the polypeptide fragments, general refolding schemes can be employed, cf. e.g. the disclosure in WO 94/18227 where such a generally applicable refolding method is described.

As mentioned above, it will normally be interesting to omit the leader sequences from the polypeptide fragments of the invention. However, by producing fusion polypeptides, superior characteristics of the polypeptide fragments of the invention can be achieved. For instance, fusion partners which facilitate export of the polypeptide when produced recombinantly, fusion partners which facilitate purification of the polypeptide, and fusion partners which enhance the immunogenicity of the polypeptide fragment of the invention are all interesting possibilities. Therefore, the invention also pertains to a fusion polypeptide comprising at least one polypeptide fragment defined above and at least one fusion partner. The fusion partner can, in order to enhance immunogenicity, e.g. be selected from the group consisting of another polypeptide fragment as defined above (so as to allow for multiple expression of relevant epitopes), and another polypeptide derived from a bacterium belonging to the tuberculosis complex, such as ESAT-6, MPB64, MPT64, and MPB59 or at least one T-cell epitope of any of these antigens. Other immunogenicity enhancing polypeptides which could serve as fusion partners are T-cell epitopes (e.g. derived from the polypeptides ESAT-6, MPB64, MPT64, or MPB59) or other immunogenic epitopes enhancing the immunogenicity of the target gene product, e.g. lymphokines such as IFN-γ, IL-2 and IL-12. In order to facilitate expression and/or purification the fusion partner can e.g. be a bacterial fimbrial protein, e.g. the pilus components pilin and papA; protein A; the ZZ-peptide (ZZ-fusions are marketed by Pharmacia in Sweden); the maltose binding protein; glutathione S-transferase; β-galactosidase; or polyhistidine.

Also a method of determining the presence of virulent Mycobacterium nucleic acids in a mammal, including a human being, or in a sample, comprising incubating the sample with a nucleic acid sequence of the invention or a nucleic acid sequence complementary thereto, and detecting the presence of hybridized nucleic acids resulting from the incubation (by using the hybridization assays which are well-known in the art), is included in the invention. Such a method of diagnosing TB might involve the use of a composition comprising at least a part of a nucleotide sequence as defined above and detecting the presence of nucleotide sequences in a sample from the animal or human being to be tested which hybridizes with the nucleic acid sequence (or a complementary sequence) by the use of PCR techniques.

In certain aspects, the DNA sequence information provided by this invention allows for the preparation of relatively short DNA (or RNA or PNA) sequences having the ability to specifically hybridize to mycobacterial gene sequences. In these aspects, nucleic acid probes of an appropriate length are prepared based on a consideration of the relevant sequence. The ability of such nucleic acid probes to specifically hybridize to the mycobacterial gene sequences lends them particular utility in a variety of embodiments. Most importantly, the probes can be used in a variety of diagnostic assays for detecting the presence of pathogenic organisms in a given sample. However, other uses are envisioned, including the use of the sequence information for the preparation of mutant species primers, or primers for use in preparing other genetic constructs.

In one embodiment of the invention a composition is produced comprising as the effective component a micro-organism, the micro-organism being a bacterium such as Mycobacterium, Salmonella, Pseudomonas and Escherichia, preferably Mycobacterium bovis BCG, wherein at least one, such as at least 2 copies, such as at least 5 copies of a nucleotide fragment comprising a nucleotide sequence encoding a polypeptide of the invention has been incorporated into the genome of the micro-organism or introduced as a part of an expression vector in a manner allowing the micro-organism to express and optionally secrete the polypeptide. In a preferred embodiment, the composition comprises at least 2 different nucleotide sequences encoding at least 2 different polypeptides of the invention.

Another part of the invention pertains to a nucleic acid fragment in isolated form which

-   1) comprises a nucleic acid sequence which encodes a polypeptide or fusion polypeptide as defined above, or comprises a nucleic acid sequence complementary thereto, and/or
-   2) has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions (as defined in the art, i.e. 5-10° C. under the melting point T_(m), cf. Sambrook et al, 1989, pages 11.45-11.49) with a nucleic acid fragment which has a nucleotide sequence selected from
    -   SEQ ID NO: 7 or a sequence complementary thereto,
    -   SEQ ID NO: 29 or a sequence complementary thereto,
    -   SEQ ID NO: 33 or a sequence complementary thereto,
    -   SEQ ID NO: 37 or a sequence complementary thereto,
    -   SEQ ID NO: 148 or a sequence complementary thereto,
    -   SEQ ID NO: 63 or a sequence complementary thereto,
    -   SEQ ID NO: 9 or a sequence complementary thereto,
    -   SEQ ID NO: 55 or a sequence complementary thereto,
    -   SEQ ID NO: 87 or a sequence complementary thereto.

It is preferred that the nucleic acid fragment is a DNA fragment.

To provide certainty of the advantages in accordance with the invention, the preferred nucleic acid sequence when employed for hybridization studies or assays includes sequences that are complementary to at least a 10 to 40, or so, nucleotide stretch of the selected sequence. A size of at least 10 nucleotides in length helps to ensure that the fragment will be of sufficient length to form a duplex molecule that is both stable and selective. Molecules having complementary sequences over stretches greater than 10 bases in length are generally preferred, though, in order to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of specific hybrid molecules obtained.
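The idea of a probe that is complementary to a stretch of at least 10 nucleotides can be illustrated with a short check. The sketch below assumes perfect Watson-Crick complementarity as a stand-in for hybridization (real hybridization tolerates some mismatches depending on stringency); the function names and the toy target sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def can_form_perfect_duplex(probe: str, target: str, min_len: int = 10) -> bool:
    """True if the probe is at least min_len long and pairs perfectly with the target.

    Perfect pairing means the probe's reverse complement occurs in the target.
    """
    if len(probe) < min_len:
        return False
    return reverse_complement(probe) in target.upper()


if __name__ == "__main__":
    target = "GGATCCATGGAAAAAATGTCACTTTCG"        # toy sequence, illustration only
    probe = reverse_complement(target[5:20])       # 15-nt probe against the target
    print(can_form_perfect_duplex(probe, target))  # long enough and complementary
    print(can_form_perfect_duplex("ACGT", target)) # shorter than 10 nt
```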

A preferred immunologic composition according to the present invention comprises at least two different polypeptide fragments, each different polypeptide fragment being a polypeptide or a fusion polypeptide defined above. It is preferred that the immunologic composition comprises between 3 and 20 different polypeptide fragments or fusion polypeptides.

EXAMPLES

Example 1A

Isolation of CFP21

ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and after resuspension washed with 8 M urea. CHAPS and glycerol were added to a final concentration of 0.5% (w/v) and 5% (v/v) respectively and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8 M urea buffer containing 0.5% (w/v) CHAPS, 5% (v/v) glycerol, 3% (v/v) Biolyt 3/5 and 1% (v/v) Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1-3 ml. An equal volume of SDS containing sample buffer was added and the protein solution boiled for 5 min before further separation on a Prep Cell (BioRad) in a matrix of 16% polyacrylamide under an electrical gradient. Fractions containing pure proteins with a molecular mass from 17-30 kDa were collected.

N-Terminal Sequencing and Amino Acid Analysis

CFP21 was washed with water on a Centricon concentrator (Amicon) with cutoff at 10 kDa and then applied to a ProSpin concentrator (Applied Biosystems) where the proteins were collected on a PVDF membrane. The membrane was washed 5 times with 20% methanol before sequencing on a Procise sequencer (Applied Biosystems).

The following N-terminal sequence was obtained:

For CFP21: D P X S D I A V V F A R G T H (SEQ ID NO: 150)

“X” denotes an amino acid which could not be determined by the sequencing method used, whereas a “/” between two amino acids denotes that the sequencing method could not determine which of the two amino acids is the one actually present.

Homology Searches in the Sanger Database

For CFP21 the N-terminal amino acid sequence was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis database:

http://www.sanger.ac.uk/pathogens/TB-blast-server.html.

Thereby, the following information was obtained:

CFP21

A sequence 100% identical to the 14 determined amino acids of CFP21 was found at MTCY39. From the N-terminal sequencing it was not possible to determine amino acid number 3; this amino acid is a C in MTCY39. The amino acid C cannot be detected on a sequencer, which probably explains this difference.

Within the open reading frame the translated protein is 217 amino acids long. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 33, in agreement with the presence of a signal sequence that has been cleaved off. This gives a length of the mature protein of 185 amino acids, which corresponds to a theoretical molecular weight of 18657 Da and a theoretical pI of 4.6. The observed weight in SDS-PAGE is 21 kDa.

In a 193 amino acids overlap the protein has 32.6% identity to a cutinase precursor with a length of 209 amino acids (CUTI_ALTBR P41744).

A comparison of the 14 N-terminal determined amino acids with the translated region (RD2) deleted in M. bovis BCG revealed a 100% identical sequence (mb3484) (Mahairas et al. (1996)).

CFP21: (SEQ ID NO: 10)
  1 MTPRSLVRIV GVVVATTLAL VSAPAGGRAA HADPCSDIAV
 41 VFARGTHQAS GLGDVGEAFV DSLTSQVGGR SIGVYAVNYP ASDDYRASAS
 91 NGSDDASAHI QRTVASCPNT RIVLGGYSQG ATVIDLSTSA MPPAVADHVA
141 AVALFGEPSS GFSSMLWGGG SLPTIGPLYS SKTINLCAPD DPICTGGGNI
191 MAHVSYVQSG MTSQAATFAA NRLDHAG
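The positions quoted for CFP21 can be re-derived from the sequence above. The following is a minimal sketch that locates the experimentally determined N-terminus in SEQ ID NO: 10, treating the undetermined residue "X" as a wildcard; the variable and function names are illustrative only.

```python
import re

# SEQ ID NO: 10 as printed above, with spacing and position numbers removed.
CFP21 = (
    "MTPRSLVRIVGVVVATTLALVSAPAGGRAAHADPCSDIAV"
    "VFARGTHQASGLGDVGEAFVDSLTSQVGGRSIGVYAVNYPASDDYRASAS"
    "NGSDDASAHIQRTVASCPNTRIVLGGYSQGATVIDLSTSAMPPAVADHVA"
    "AVALFGEPSSGFSSMLWGGGSLPTIGPLYSSKTINLCAPDDPICTGGGNI"
    "MAHVSYVQSGMTSQAATFAANRLDHAG"
)
N_TERMINUS = "DPXSDIAVVFARGTH"  # SEQ ID NO: 150; X = residue not determined


def locate_n_terminus(protein: str, n_terminal: str):
    """Return (1-based start, mature length, residues found at X positions)."""
    pattern = n_terminal.replace("X", ".")  # X matches any single residue
    match = re.search(pattern, protein)
    if match is None:
        raise ValueError("N-terminal sequence not found in protein")
    start = match.start() + 1
    mature_length = len(protein) - match.start()
    resolved = [protein[match.start() + i]
                for i, aa in enumerate(n_terminal) if aa == "X"]
    return start, mature_length, resolved


if __name__ == "__main__":
    print(len(CFP21))
    print(locate_n_terminus(CFP21, N_TERMINUS))
```

Run on the sequence as printed, the sketch reports a precursor of 217 residues, a mature N-terminus at residue 33, a mature length of 185 residues, and a cysteine at the undetermined position, in line with the values stated in the text.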

Cloning of the Gene Encoding CFP21

The gene encoding CFP21 was cloned into the expression vector pMCT6, by PCR amplification with gene specific primers, for recombinant expression of the protein in E. coli.

PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1× low salt Taq+ buffer from Stratagene supplemented with 250 μM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in 10 μl reaction volume. Reactions were initially heated to 94° C. for 25 sec. and run for 30 cycles according to the following program: 94° C. for 10 sec., 55° C. for 10 sec. and 72° C. for 90 sec., using thermocycler equipment from Idaho Technology.

The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II+-T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidine residues which are added to the N-terminal of the expressed proteins. The resulting clones were hereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.

For cloning the following gene specific primers were used:

CFP21: Primers used for cloning of cfp21:
OPBR-55: ACAGATCTGCGCATGCGGATCCGTGT (SEQ ID NO: 151)
OPBR-56: TTTTCCATGGTCATCCGGCGTGATCGAG (SEQ ID NO: 152)

OPBR-55 and OPBR-56 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.
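The engineered sites can be confirmed by scanning the primers for the enzymes' recognition sequences (BglII: AGATCT, NcoI: CCATGG). The following is a minimal sketch; the dictionary and function names are illustrative, and only the two enzymes named in the text are checked.

```python
RECOGNITION_SITES = {"BglII": "AGATCT", "NcoI": "CCATGG"}

PRIMERS = {
    "OPBR-55": "ACAGATCTGCGCATGCGGATCCGTGT",    # SEQ ID NO: 151
    "OPBR-56": "TTTTCCATGGTCATCCGGCGTGATCGAG",  # SEQ ID NO: 152
}


def sites_in(primer: str) -> dict:
    """Map enzyme name to the 0-based offset of its site, for sites present."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in primer
    }


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, sites_in(sequence))
```

Both recognition sequences are palindromic, so scanning the primer strand as written is sufficient.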

Expression/Purification of Recombinant CFP21 Protein.

Expression and metal affinity purification of recombinant proteins was undertaken essentially as described by the manufacturers. For each protein, 1 l LB-media containing 100 μg/ml ampicillin, was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37° C. until they reached a density of OD₆₀₀=0.4-0.6. IPTG was hereafter added to a final concentration of 1 mM and the cultures were further incubated 4-16 hours. Cells were harvested, resuspended in 1× sonication buffer+8 M urea and sonicated 5×30 sec. with 30 sec. pausing between the pulses.

After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD₂₈₀. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5.

Finally the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.

Example 1B

Identification of RD1-ORF3

In an effort to control the threat of TB, bacillus Calmette-Guérin (BCG) has been used as a live attenuated vaccine. BCG is an attenuated derivative of a virulent Mycobacterium bovis. The original BCG from the Pasteur Institute in Paris, France was developed from 1908 to 1921 by 231 passages in liquid culture and has never been shown to revert to virulence in animals, indicating that the attenuating mutation(s) in BCG are stable deletions and/or multiple mutations which do not readily revert. While physiological differences between BCG and M. tuberculosis and M. bovis have been noted, the attenuating mutations which arose during serial passage of the original BCG strain have been unknown until recently. The first mutations described are the loss of the gene encoding MPB64 in some BCG strains (Li et al., 1993, Oettinger and Andersen, 1994) and the gene encoding ESAT-6 in all BCG strains tested (Harboe et al., 1996); later, 3 large deletions in BCG have been identified (Mahairas et al., 1996). The region named RD1 includes the gene encoding ESAT-6 and another (RD2) the gene encoding MPT64. Both antigens have been shown to have diagnostic potential and ESAT-6 has been shown to have properties as a vaccine candidate (cf. PCT/DK94/00273 and PCT/DK/00270). In order to find new M. tuberculosis specific diagnostic antigens as well as antigens for a new vaccine against TB, the RD1 region (17,499 bp) of M. tuberculosis H37Rv has been analyzed for Open Reading Frames (ORF). ORFs with a minimum length of 96 bp have been predicted using the algorithm described by Borodovsky and McIninch (1993). In total, 27 ORFs have been predicted; 20 of these have possible diagnostic and/or vaccine potential, as they are deleted from all known BCG strains. The predicted ORFs include ESAT-6 (RD1-ORF7) and CFP10 (RD1-ORF6) described previously (Sørensen et al., 1995), which serve as a positive control for the ability of the algorithm. The present example describes the potential of one of the predicted antigens for diagnosis of TB.

Identification of rd1-orf3.

The nucleotide sequence of rd1-orf3 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 87. The deduced amino acid sequence of RD1-ORF3 is set forth in SEQ ID NO: 88.

The DNA sequence rd1-orf3 (SEQ ID NO: 87) contained an open reading frame starting with an ATG codon at position 2807-2809 and ending with a termination codon (TAA) at position 3101-3103 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 88) contains 98 residues corresponding to a molecular weight of 9,799.
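The residue count follows directly from the coordinates given. As a minimal sketch (the function name is illustrative), the ORF length and the number of encoded residues can be computed from the first nucleotide of the start codon and the last nucleotide of the stop codon:

```python
def orf_summary(first_nt: int, last_nt: int):
    """Return (ORF length in nt, number of encoded residues).

    first_nt is the first base of the start codon and last_nt the last base
    of the stop codon, both inclusive. The stop codon encodes no residue.
    """
    length_nt = last_nt - first_nt + 1
    if length_nt % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    return length_nt, length_nt // 3 - 1


if __name__ == "__main__":
    # ATG at 2807-2809, TAA at 3101-3103 (positions within RD1)
    print(orf_summary(2807, 3103))
```

For the coordinates in the text this gives 297 nucleotides, i.e. 99 codons including the stop codon, and therefore 98 residues.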

Cloning of rd1-orf3.

Rd1-orf3 was PCR cloned in the pMST24 (Theisen et al., 1995) expression vector. Chromosomal DNA from M. tuberculosis H37Rv was used as template in the PCR reactions. Oligonucleotides were synthesized on the basis of the nucleotide sequence from the RD1 region (Accession no. U34848). The oligonucleotide primers were engineered to include a restriction enzyme site at the 5′ end and at the 3′ end by which a later subcloning was possible. Primers are listed in Table 2. A SmaI site was engineered immediately 5′ of the first codon of rd1-orf3, and an NcoI site was incorporated right after the stop codon at the 3′ end. The gene rd1-orf3 was subcloned in pMST24, giving pTO87.

The PCR fragments were digested with the suitable restriction enzymes, purified from an agarose gel and cloned into pMST24. The construct was used to transform the E. coli XL1-Blue. Endpoints of the gene fusions were determined by the dideoxy chain termination method. Both strands of the DNA were sequenced.

Purification of Recombinant RD1-ORF3.

The rRD1-ORF3 was fused N-terminally to the (His)₆-tag. Recombinant antigen was prepared as described in example 1a. Purification of recombinant antigen by Ni²⁺ affinity chromatography was also carried out as described in example 1b. Fractions containing purified His-rRD1-ORF3 were pooled. The His-rRD1-ORF3 was extensively dialysed against 10 mM Tris/HCl, pH 8.5, 3 M urea, followed by an additional purification step performed on an anion exchange column (Mono Q) using fast protein liquid chromatography (FPLC) (Pharmacia, Uppsala, Sweden). The purification was carried out in 10 mM Tris/HCl, pH 8.5, 3 M urea and protein was eluted by a linear gradient of NaCl from 0 to 1 M. Fractions containing the His-rRD1-ORF3 were pooled and subsequently dialysed extensively against 25 mM Hepes, pH 8.0 before use.

TABLE 2
Sequence of the rd1-orf3 oligonucleotides^(a).

Orientation and oligonucleotide    Sequences (5′→3′)                                 Position (nt)
Sense RD1-ORF3f                    CTTCCCGGGATGGAAAAAATGTCAC (SEQ ID NO: 153)        2807-2822
Antisense RD1-ORF3r                GATGCCATGGTTAGGCGAAGACGCCGGC (SEQ ID NO: 154)     3103-3086

^(a)The oligonucleotides were constructed from the Accession number U34484 nucleotide sequence (Mahairas et al., 1996). Nucleotides (nt) underlined are not contained in the nucleotide sequence of RD1-ORF's. The positions correspond to the nucleotide sequence of Accession number U34484.
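The primer entries in Table 2 can be cross-checked against the stated positions. The sketch below assumes that the gene-specific part of each primer is everything 3′ of the engineered restriction site (SmaI: CCCGGG, NcoI: CCATGG); this split is an assumption made for illustration, since the underlining that marks the added nucleotides is not reproduced in the text. All names in the code are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# name: (primer 5'->3', engineered site, first position, last position)
TABLE_2 = {
    "RD1-ORF3f": ("CTTCCCGGGATGGAAAAAATGTCAC", "CCCGGG", 2807, 2822),
    "RD1-ORF3r": ("GATGCCATGGTTAGGCGAAGACGCCGGC", "CCATGG", 3103, 3086),
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gene_specific_part(primer: str, site: str) -> str:
    """Portion of the primer 3' of the engineered restriction site (assumed split)."""
    return primer[primer.index(site) + len(site):]


def span_length(first: int, last: int) -> int:
    """Number of nucleotides covered by an inclusive position range."""
    return abs(last - first) + 1


if __name__ == "__main__":
    for name, (primer, site, first, last) in TABLE_2.items():
        part = gene_specific_part(primer, site)
        print(name, part, len(part) == span_length(first, last))
    sense = gene_specific_part(*TABLE_2["RD1-ORF3f"][:2])
    antisense = gene_specific_part(*TABLE_2["RD1-ORF3r"][:2])
    print(sense.startswith("ATG"))                       # start codon
    print(reverse_complement(antisense).endswith("TAA")) # stop codon
```

Under this assumption the gene-specific parts are 16 and 18 nucleotides long, matching the position ranges 2807-2822 and 3103-3086, and they begin with the ATG start codon and end (on the coding strand) with the TAA stop codon described for rd1-orf3.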

The nucleotide sequences of rd1-orf3 from M. tuberculosis H37Rv are set forth in SEQ ID NO: 87. The deduced amino acid sequences of rd1-orf3 are set forth in SEQ ID NO: 88.

Example 1C

Identification of CFP8A, CFP16 and CFP23

Identification of CFP16.

ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and after resuspension washed with 8 M urea. CHAPS and glycerol were added to a final concentration of 0.5% (w/v) and 5% (v/v) respectively and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8 M urea buffer containing 0.5% (w/v) CHAPS, 5% (v/v) glycerol, 3% (v/v) Biolyt 3/5 and 1% (v/v) Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1-3 ml. An equal volume of SDS containing sample buffer was added and the protein solution boiled for 5 min before further separation on a Prep Cell (BioRad) in a matrix of 16% polyacrylamide under an electrical gradient. Fractions containing well separated bands in SDS-PAGE were selected for N-terminal sequencing after transfer to PVDF membrane.

Isolation of CFP8A

ST-CF was precipitated with ammonium sulphate at 80% saturation and redissolved in PBS, pH 7.4, and dialysed 3 times against 25 mM piperazin-HCl, pH 5.5, and subjected to chromatofocusing on a matrix of PBE 94 (Pharmacia) in a column connected to an FPLC system (Pharmacia). The column was equilibrated with 25 mM piperazin-HCl, pH 5.5, and the elution was performed with 10% PB74-HCl, pH 4.0 (Pharmacia). Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1-3 ml and separated on a Prepcell as described above.

N-Terminal Sequencing

Fractions containing CFP8A and CFP16 were blotted to PVDF membrane after Tricine SDS-PAGE (Ploug et al, 1989). The relevant bands were excised and subjected to N-terminal amino acid sequence analysis on a Procise 494 sequencer (Applied Biosystems). The fraction containing CFP25A was blotted to PVDF membrane after 2-DE PAGE (isoelectric focusing in the first dimension and Tricine SDS-PAGE in the second dimension). The relevant spot was excised and sequenced as described above.

The following N-terminal sequences were obtained:

CFP8A: DPVDDAFIAKLNTAG (SEQ ID NO: 155)
CFP16: AKLSTDELLDAFKEM (SEQ ID NO: 156)

N-Terminal Homology Searching in the Sanger Database and Identification of the Corresponding Genes.

The N-terminal amino acid sequence from each of the proteins was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis database:

http://www.sanger.ac.uk/projects/m-tuberculosis/TB-blast-server.

For CFP8A and CFP16 the following information was obtained:

CFP8A: A sequence 80% identical to the 15 N-terminal amino acids was found on contig TB_1884. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 32. This gives a length of the mature protein of 98 amino acids, corresponding to a theoretical MW of 9700 Da and a pI of 3.72. This is in good agreement with the observed MW on SDS-PAGE at approximately 8 kDa. The full length protein has a theoretical MW of 12989 Da and a pI of 4.38.

CFP16: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY20H1.

The identity is found within an open reading frame of 130 amino acids length corresponding to a theoretical MW of CFP16 of 13440.4 Da and a pI of 4.59. The observed molecular weight in an SDS-PAGE gel is 16 kDa.

Use of Homology Searching in the EMBL Database for Identification of CFP23.

Homology searching in the EMBL database (using the GCG package of the Biobase, Århus-DK) with the amino acid sequences of two earlier identified highly immunoreactive ST-CF proteins, using the TFASTA algorithm, revealed that these proteins (CFP21 and CFP25) belong to a family of fungal cutinase homologs. Among the most homologous sequences were also two Mycobacterium tuberculosis sequences found on cosmid MTCY13E12. The first, MTCY13E12.04 has 46% and 50% identity to CFP25 and CFP21 respectively. The second, MTCY13E12.05, has also 46% and 50% identity to CFP25 and CFP21. The two proteins share 62.5% aa identity in a 184 residues overlap. On the basis of the high homology to the strong T-cell antigens CFP21 and CFP25, respectively, it is believed that CFP19A and CFP23 are possible new T-cell antigens.

The first reading frame encodes a 254 amino acid protein of which the first 26 aa constitute a putative leader peptide that strongly indicates an extracellular location of the protein. The mature protein is thus 228 aa in length, corresponding to a theoretical MW of 23149.0 Da and a pI of 5.80. The protein is named CFP23.

The second reading frame encodes a 231 aa protein of which the first 44 aa constitute a putative leader peptide that strongly indicates an extracellular location of the protein. The mature protein is thus 187 aa in length, corresponding to a theoretical MW of 19020.3 Da and a pI of 7.03. The protein is named CFP19A.

The presence of putative leader peptides in both proteins (and thereby their presence in the ST-CF) is confirmed by theoretical sequence analysis using the SignalP program at the ExPASy Molecular Biology server

(http://expasy.hcuge.ch/www/tools.html).

Searching for Homologies to CFP16 and CFP23 in the EMBL Database.

The amino acid sequences derived from the translated genes of the individual antigens were used for homology searching in the EMBL and Genbank databases using the TFASTA algorithm, in order to find homologous proteins and to address possible functional roles of the antigens.

CFP16: rplL gene, 130 aa. Identical to the M. bovis 50S ribosomal protein L7/L12 (acc. No P37381).

CFP23: CFP23 has between 38% and 46% identity to several cutinases from different fungal sp.

In addition, CFP23 has 46% identity and 61% similarity to CFP25 as well as 50% identity and 63% similarity to CFP21 (both proteins were isolated earlier from the ST-CF).

Cloning of the Genes Encoding CFP8A, CFP16 and CFP23

The genes encoding CFP8A, CFP16 and CFP23 were all cloned into the expression vector pMCT6, by PCR amplification with gene specific primers, for recombinant expression in E. coli of the proteins.

PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1× low salt Taq+ buffer from Stratagene supplemented with 250 μM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in 10 μl reaction volume. Reactions were initially heated to 94° C. for 25 sec. and run for 30 cycles of the program: 94° C. for 10 sec., 55° C. for 10 sec. and 72° C. for 90 sec., using thermocycler equipment from Idaho Technology.

The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II+-T vector (Stratagene). Plasmid DNA was hereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidines which are added to the N-terminal of the expressed proteins. The resulting clones were hereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.

For cloning of the individual antigens, the following gene specific primers were used:

CFP8A: Primers used for cloning of cfp8A:
CFP8A-F: CTGAGATCTATGAACCTACGGCGCC (SEQ ID NO: 157)
CFP8A-R: CTCCCATGGTACCCTAGGACCCGGGCAGCCCCGGC (SEQ ID NO: 158)

CFP8A-F and CFP8A-R create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP16: Primers used for cloning of cfp16:
OPBR-104: CCGGGAGATCTATGGCAAAGCTCTCCACCGACG (SEQ ID NO: 159)
OPBR-105: CGCTGGGCAGAGCTACTTGACGGTGACGGTGG (SEQ ID NO: 160)

OPBR-104 and OPBR-105 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP23: Primers used for cloning of cfp23:
OPBR-86: CCTTGGGAGATCTTTGGACCCCGGTTGC (SEQ ID NO: 161)
OPBR-87: GACGAGATCTTATGGGCTTACTGAC (SEQ ID NO: 162)

OPBR-86 and OPBR-87 both create a BglII site used for the cloning in pMCT6.
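The restriction sites said to be introduced by the primers can be checked by a simple string search. The sketch below is illustrative only: it uses the standard recognition sequences for BglII (AGATCT) and NcoI (CCATGG) and the CFP8A and CFP23 primer sequences as printed above; the helper name `sites_in` is an assumption. Because both recognition sequences are palindromic, searching the primer strand alone is sufficient.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"BglII": "AGATCT", "NcoI": "CCATGG"}

# Primer sequences copied from the text (5'->3').
PRIMERS = {
    "CFP8A-F": "CTGAGATCTATGAACCTACGGCGCC",
    "CFP8A-R": "CTCCCATGGTACCCTAGGACCCGGGCAGCCCCGGC",
    "OPBR-86": "CCTTGGGAGATCTTTGGACCCCGGTTGC",
    "OPBR-87": "GACGAGATCTTATGGGCTTACTGAC",
}


def sites_in(primer: str) -> list[str]:
    """Return the names of the enzymes whose site occurs in the primer."""
    return [name for name, site in SITES.items() if site in primer.upper()]


for name, seq in PRIMERS.items():
    print(name, sites_in(seq))
```

For the four primers listed here, the search agrees with the sites stated in the text.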

Expression/Purification of Recombinant CFP8A, CFP16 and CFP23 Proteins.

Expression and metal affinity purification of recombinant proteins were undertaken essentially as described by the manufacturers. For each protein, 1 l of LB medium containing 100 μg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37° C. until they reached a density of OD₆₀₀=0.4-0.6. IPTG was then added to a final concentration of 1 mM and the cultures were incubated for a further 4-16 hours. Cells were harvested, resuspended in 1× sonication buffer+8 M urea and sonicated 5×30 sec. with 30 sec. pauses between the pulses.

After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD₂₈₀. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5.

Finally the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.

Example 2

Species Distribution of cfp21, cfp23 and rd1-orf3

Presence of cfp21, cfp23 and rd1-orf3 in Different Mycobacterial Species.

The Southern blotting was carried out as described previously (Oettinger and Andersen, 1994) with the following modifications: 2 μg of genomic DNA was digested with PvuII, electrophoresed in a 0.8% agarose gel, and transferred onto a nylon membrane (Hybond N-plus; Amersham International plc, Little Chalfont, United Kingdom) with a vacuum transfer device (Milliblot, TM-v; Millipore Corp., Bedford, Mass.).

The cfp21, cfp23 and rd1-orf3 gene fragments were amplified by PCR from the recombinant pMCT6 plasmids encoding the individual genes. The primers used (same as the primers used for cloning) are described in example 1a and 1b. The results are summarized in Table 3.

TABLE 3 Interspecies analysis of the cfp21, cfp23 and rd1-orf3 genes by Southern blotting.

Species and strain                cfp21   cfp23   rd1-orf3
1. M. tub. H37Rv                  +       +       +
2. M. bovis                       +       +       +
3. M. bovis BCG Danish 1331       N.D.    +       −
4. M. bovis BCG Japan             +       +       −
5. M. avium                       +       +       −
6. M. kansasii                    −       +       −
7. M. marinum                     +       +       −
8. M. scrofulaceum                +       +       −
9. M. intracellulare              +       +       −
10. M. fortuitum                  −       +       −
11. M. xenopi                     +       +       −
12. M. szulgai                    +       +       −

+, positive reaction; −, no reaction; N.D., not determined.
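To make the species distribution in Table 3 easier to query, the results can be held in a small lookup structure. This is an illustrative sketch only; the encoding (True for "+", False for "−", None for "N.D.") and the function name are assumptions, while the values are those of Table 3.

```python
# Table 3 hybridisation results: True = "+", False = "-", None = "N.D."
TABLE3 = {
    "M. tub. H37Rv":            {"cfp21": True,  "cfp23": True, "rd1-orf3": True},
    "M. bovis":                 {"cfp21": True,  "cfp23": True, "rd1-orf3": True},
    "M. bovis BCG Danish 1331": {"cfp21": None,  "cfp23": True, "rd1-orf3": False},
    "M. bovis BCG Japan":       {"cfp21": True,  "cfp23": True, "rd1-orf3": False},
    "M. avium":                 {"cfp21": True,  "cfp23": True, "rd1-orf3": False},
    "M. kansasii":              {"cfp21": False, "cfp23": True, "rd1-orf3": False},
    "M. marinum":               {"cfp21": True,  "cfp23": True, "rd1-orf3": False},
    "M. scrofulaceum":          {"cfp21": True,  "cfp23": True, "rd1-orf3": False},
    "M. intracellulare":        {"cfp21": True,  "cfp23": True, "rd1-orf3": False},
    "M. fortuitum":             {"cfp21": False, "cfp23": True, "rd1-orf3": False},
    "M. xenopi":                {"cfp21": True,  "cfp23": True, "rd1-orf3": False},
    "M. szulgai":               {"cfp21": True,  "cfp23": True, "rd1-orf3": False},
}


def species_with(gene: str) -> list[str]:
    """Species/strains that gave a positive hybridisation signal for the gene."""
    return [species for species, row in TABLE3.items() if row[gene] is True]


for gene in ("cfp21", "cfp23", "rd1-orf3"):
    print(gene, len(species_with(gene)), species_with(gene))
```

Queried this way, rd1-orf3 is the only one of the three genes whose signal is restricted to M. tuberculosis H37Rv and M. bovis in this panel.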

Example 3 Total Extraction of Proteins from Dead M. tuberculosis Bacteria

1.5×10⁹ bacteria/ml M. tuberculosis was heat treated at 55° C. for 1.5 hours and checked for sterility. 10 ml of these heat-killed bacteria was centrifuged at 2000 g for 40 min; the supernatant was discarded and the pellet resuspended in PBS containing 0.5% Tween 20 and used as the antigen source. The pellet was sonicated with 20 rounds of 90 seconds and centrifuged 30 min at 5000 g to remove unbroken cells. The supernatant containing soluble proteins as well as cell wall and cell membrane components was extracted twice with 10% SDS to release proteins inserted in the cell wall and membrane compartments. After centrifugation at 20,000 g for 30 min, the supernatant was precipitated with 8 volumes of cold acetone and resuspended in PBS at a protein concentration of 5 mg/ml and named Somatic Proteins Extract (SPE).

Example 3A Subcellular Fractionation of Mycobacterium tuberculosis

1.5×10⁹ colony forming units (CFU/ml) of M. tuberculosis H37Rv were inactivated by heat-killing at 60° C. for 1.5 hours. The heat-killed Mycobacteria were centrifuged at 3,000×g for 20 min; the supernatant was discarded and the pellet was resuspended in cold PBS. This step was repeated twice. After the final wash, the pellet was resuspended in a homogenizing buffer consisting of PBS supplemented with 10 mM EDTA and 1 mM of phenylmethylsulfonyl fluoride in a ratio of 1 ml buffer per 0.5 g of heat-killed Mycobacteria. The sample was sonicated on ice for 15 min (1-min-pulser-on/10-sec-pulser off) and subsequently lysed three times with a French Pressure Cell at 12,000 lb/in². The lysate was centrifuged at 27,000×g for 20 min; the pellet was washed in homogenizing buffer and recentrifuged. The pooled supernatants contained a mixture of cytosol and membrane components, while the pellet represented the crude cell wall.

Preparation of Cell Wall

RNase and DNase were added to the cell wall pellet, resuspended in homogenizing buffer, to a final concentration of 1 mg/ml, and the mixture was incubated overnight at 4° C. The cell wall was washed twice in homogenizing buffer, twice in homogenizing buffer saturated with KCl, and twice with PBS. Soluble proteins were extracted from the cell wall by a 2 hour incubation with 2% SDS at 6° C. The insoluble cell wall core was removed by centrifugation at 27,000×g for 20 min and the SDS-extraction was repeated. Finally, the pooled supernatants were precipitated with 6 volumes of chilled acetone and resuspended in PBS.

Preparation of Cytosol and Membrane:

To separate the cytosol and the membrane fraction, the pooled supernatants were ultracentrifuged at 100,000×g for 2 hours at 5° C. The cytosol proteins in the supernatant were precipitated with acetone and resuspended in PBS. The pellet, representing the membrane fraction, was washed in PBS, ultracentrifuged, and finally resuspended in PBS.

Triton X-114 Extraction of Cell Wall and Membrane:

To prepare protein fractions largely devoid of lipoarabinomannan, the cell wall and the membrane fraction were subjected to extraction with precondensed Triton X-114. Triton X-114 was added to the protein sample at a final concentration of 4%. The solution was mixed on ice for 60 min and centrifuged at 20,000×g for 15 min at 4° C. The pellet containing residual insoluble material was extracted once more (membrane) or twice (cell wall), while the supernatant was warmed to 37° C. to condense the Triton X-114. After centrifugation of the supernatant at 12,000×g for 15 min, the aqueous phase and detergent phase were separated. The aqueous phase and detergent phase were washed twice with Triton X-114 and PBS, respectively. The combined aqueous phases and residual insoluble material containing the majority of proteins were pooled, precipitated with acetone, and resuspended in PBS.

Example 4A Identification of Proteins from the Cytosolic Fraction

Use of Patient Sera to Identify M. tuberculosis Antigens

This example illustrates the identification of antigens from the cytosol fraction by screening with serum from M. tuberculosis infected individuals in western blot. The reaction with serum was used as an indication that the proteins are recognized immunologically.

Identification of Abundant Proteins

As immunity to tuberculosis is not B-cell but T-cell mediated, reactivity with serum from TB patients was not the only selection criterion used to identify proteins from the cytosol. Further proteins were selected by virtue of their abundance in the cytosol.

The cytosol was precipitated with ammonium sulphate at 80% saturation. The non-precipitated proteins were removed by centrifugation and the precipitated proteins were resuspended in 20 mM imidazole, pH 7.0. The protein solution was applied to a DEAE Sepharose 6B column equilibrated with 20 mM imidazole. Bound protein was eluted from the column using a salt gradient from 0 to 1 M NaCl in 20 mM imidazole. Fractions collected during elution were analyzed on a silver-stained 10-20% SDS-PAGE and by 2-dimensional electrophoresis. Fractions containing well-separated bands were selected for 2D electrophoresis and blotted to PVDF, after which spots, visualised by staining with Coomassie Blue, were selected for N-terminal sequencing.

The following N-terminal sequence was obtained

For TB15A: S A Y K T V V V G T D D X S X (SEQ ID NO: 163)

No sequence identity was found, when searching the Sanger database using the blast program. However, when the blast program at Swiss-blast was used, a sequence was obtained.

TB15A

For the determined N-terminal sequence of TB15A a 78% identical sequence was found in MTCY01B2.28. The X at position 13 of the determined N-terminal sequence corresponds to a G in MTCY01B2.28 and the X at position 15 to a D.

Within the open reading frame the translated protein is 146 amino acids long. The N-terminal sequence of the protein identified in the cytosol starts at amino acid no 2, with the N-terminal Met cleaved off.

This gives a protein of 146 amino acids, which corresponds to a theoretical molecular mass of 15 313 Da and a theoretical pI of 5.60. The observed mass in SDS-PAGE is 16 kDa.

The highest sequence identity, 32% in a 34 amino acid overlap, was found to a conserved protein of Methanobacterium thermoautotrophicum.

Example 4B Identification of Proteins from the Cell Wall

Identification of TB16, TB32 and TB51.

Proteins contained in the cell wall fraction were separated by 2-D electrophoresis. A sample containing 120 mg protein was subjected to isoelectric focusing in a pH gradient from 4 to 7. The second dimension separation (SDS-PAGE) was carried out in a 10-20% acrylamide gradient. After blotting onto a PVDF membrane, proteins could be visualised by Coomassie blue staining.

N-Terminal Sequencing.

The relevant spots were excised from the PVDF membrane and subjected to N-terminal sequencing using a Procise sequencer (Applied Biosystems). The following N-terminal sequences were obtained:

TB16: ADKTTQTIYIDADPG (SEQ ID NO: 164)
TB32: SGNSSLGIIVGIDD (SEQ ID NO: 165)
TB51: MKSTVEQLSPTRVRI (SEQ ID NO: 166)

N-Terminal Sequence Identity Searching and Identification of the Corresponding Genes.

The N-terminal amino acid sequence from each of the proteins identified was used for a sequence identity search using the tblastn program at NCBI:

http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-blast?Jform=0

The following information was obtained:

TB16:

The 15 aa N-terminal sequence was found to be 100% identical to a sequence found within the Mycobacterium tuberculosis sequence MTV021.

The identity is found within an open reading frame of 144 amino acids in length, corresponding to a theoretical molecular mass of 16294 Da and a pI of 4.64. The apparent molecular mass in an SDS-PAGE gel is 17 kDa.

The amino acid sequence shows some similarity to other hypothetical Mycobacterial proteins.

TB32:

The 14 aa N-terminal sequence was found to be 100% identical to a sequence found within the Mycobacterium tuberculosis sequence MTCY1A10.

The identity is found within an open reading frame of 297 amino acids in length, corresponding to a theoretical molecular mass of 31654 Da and a pI of 5.55. The apparent molecular mass in an SDS-PAGE gel is 33 kDa.

The amino acid sequence shows some similarity to other hypothetical Mycobacterial proteins.

TB51:

The 15 aa N-terminal sequence was found to be 100% identical to a sequence found within the Mycobacterium tuberculosis sequence MTV008.

The identity is found within an open reading frame of 466 amino acids in length, corresponding to a theoretical molecular mass of 50587 Da and a pI of 4.3. The apparent molecular mass in an SDS-PAGE gel is 56 kDa.

The amino acid sequence shows similarities to trigger factor from several organisms. Possible chaperone protein.

Example 4C Cloning of the Genes Encoding TB15A, TB16, TB32 and TB51

The genes encoding TB15A, TB16, TB32 and TB51 were all cloned into the E. coli expression vector pMCT3, by PCR amplification with gene specific primers.

Each PCR reaction contained 10 ng of M. tuberculosis chromosomal DNA in 1× low salt Taq+ buffer (Stratagene) supplemented with 250 μM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer, and 0.5 unit Taq+ DNA polymerase (Stratagene) in a 10 μl reaction volume. Reactions were initially heated to 94° C. for 25 sec. and run for 30 cycles according to the following program: 94° C. for 10 sec., 55° C. for 10 sec., and 72° C. for 90 sec., using thermocycler equipment from Idaho Technology.

The PCR fragment was ligated with TA cloning vector pCR® 2.1 (Invitrogen) and transformed into E. coli. Plasmid DNA was thereafter prepared from clones harbouring the desired fragment, digested with suitable restriction enzymes and subcloned into the expression vector pMCT3 in frame with 6 histidine residues which are added to the N-terminal of the expressed proteins. The resulting clones were hereafter sequenced by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.

Expression and metal affinity purification of recombinant proteins were undertaken essentially as described by the manufacturers. For each protein, 1 l of LB medium containing 100 μg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT3 plasmids. Cultures were shaken at 37° C. until they reached a density of OD₆₀₀=0.4-0.6. IPTG was then added to a final concentration of 1 mM and the cultures were incubated for a further 4-16 hours. Cells were harvested, resuspended in 1× sonication buffer+8 M urea and sonicated 5×30 sec. with 30 sec. pauses between the pulses.

After centrifugation, the lysate was applied to a column containing 10 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at OD₂₈₀ nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using 1 ml HiTrap columns (Pharmacia, Sweden) eluted with a linear salt gradient from 0-1 M NaCl. Fractions were analysed by SDS-PAGE and protein concentrations were estimated at OD₂₈₀ nm. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5.

Finally, the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.

For cloning of the individual proteins, the following gene specific primers were used:

TB15A: Primers used for cloning of TB15A:
TB15A-F: CTG CCA TGG CTA GGT GGT GTG CAC GAT C (SEQ ID NO: 167)
TB15A-R: CTG AAG CTT ATG AGC GCC TAT AAG ACC (SEQ ID NO: 168)

TB15A-F and TB15A-R create NcoI and HindIII sites, respectively, used for the cloning in pMCT3.

TB16: Primers used for cloning of TB16:
TB16-F: CTG AGA TCT GCG GAC AAG ACG ACA CAG (SEQ ID NO: 169)
TB16-R: CTC CCA TGG TAC CGG AAT CAC TCA GCC (SEQ ID NO: 170)

TB16-F and TB16-R create BglII and NcoI sites, respectively, used for the cloning in pMCT3.

TB32: Primers used for cloning of TB32:
TB32-F: CTG AGA TCT ATG TCA TCG GGC AAT TCA (SEQ ID NO: 171)
TB32-R: CTC CCA TGG CTAC CTA AGT CAG CGA CTC GCG (SEQ ID NO: 172)

TB32-F and TB32-R create BglII and NcoI sites, respectively, used for the cloning in pMCT3.

TB51: Primers used for cloning of TB51:
TB51-F: CTG AGA TCT GTG AAG AGC ACC GTC GAG (SEQ ID NO: 173)
TB51-R: CTC CCA TGG GTC ATA CGG TCA CGT TGT (SEQ ID NO: 174)

TB51-F and TB51-R create BglII and NcoI sites, respectively, used for the cloning in pMCT3.
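As a consistency check between a cloning primer and the N-terminal sequence determined by Edman sequencing, the gene-specific part of a primer can be translated with the standard genetic code. The sketch below does this for TB16-F as printed above; it assumes that the first nine nucleotides (three leading nucleotides plus the six-nucleotide restriction site) are a non-coding tail, and the helper names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; spaces are ignored, trailing bases dropped."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


TB16_F = "CTG AGA TCT GCG GAC AAG ACG ACA CAG"   # SEQ ID NO: 169
TB16_NTERM = "ADKTTQTIYIDADPG"                    # SEQ ID NO: 164

# Assumption: the first 9 nt are the 5' tail carrying the BglII site.
gene_specific = TB16_F.replace(" ", "")[9:]
peptide = translate(gene_specific)
print(peptide, TB16_NTERM.startswith(peptide))
```

For TB16-F the translated gene-specific part is ADKTTQ, which matches the first six residues of the sequenced TB16 N-terminus.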

Example 5 Evaluation of Immunological Activity of Identified Somatic Proteins

The Use of Polypeptides as Diagnostic Reagents:

A polypeptide has diagnostic potential in humans when it induces significantly higher responses in patients with microscopy- or culture-positive tuberculosis than in PPD-positive or PPD-negative individuals with no known history of TB infection or exposure to M. tuberculosis but who may or may not have received a prior BCG vaccination, have been exposed to non-tuberculosis mycobacteria (NTM), or be actively infected with M. avium. To identify polypeptides capable of discriminating between the above-mentioned groups, the level of response and the frequency of positive responders to the polypeptide are compared. Positive responders are defined by i) reactivity of human serum or plasma from TB patients with the polypeptide in a conventional antibody ELISA/Western blot, or ii) an in vivo delayed-type hypersensitivity response to the polypeptide which is at least 5 mm larger than the response induced by a control material.

The diagnostic potential of polypeptides will initially be evaluated in 10 individuals with TB infection and 10 individuals with no known exposure to virulent Mycobacteria. High specificity (>80%) will be the most important selection criterion for these polypeptides; a sensitivity >80% is desirable, but a sensitivity >30% is acceptable, as several specific antigens recognized by different individuals may be combined in a cocktail of diagnostic reagents.
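The selection criteria above can be written out as a short calculation. This is a minimal sketch under stated assumptions: sensitivity is taken as the fraction of TB-infected individuals who respond, specificity as the fraction of unexposed individuals who do not respond, and the category labels and example counts are hypothetical.

```python
def rates(tb_positive: int, tb_total: int,
          control_positive: int, control_total: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from responder counts."""
    sensitivity = tb_positive / tb_total
    specificity = (control_total - control_positive) / control_total
    return sensitivity, specificity


def screen(sensitivity: float, specificity: float) -> str:
    """Apply the thresholds stated in the text (>80% specificity first)."""
    if specificity <= 0.80:
        return "not selected (specificity too low)"
    if sensitivity > 0.80:
        return "desirable"
    if sensitivity > 0.30:
        return "acceptable"
    return "not selected (sensitivity too low)"


# Hypothetical outcome for a panel of 10 TB-infected and 10 unexposed individuals.
sens, spec = rates(tb_positive=4, tb_total=10, control_positive=1, control_total=10)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} -> {screen(sens, spec)}")
```

With only 10 individuals per group, each additional responder shifts the estimate by 10 percentage points, so such figures are best read as a coarse initial screen.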

Skin Test Reaction in TB Infected Guinea Pigs

To identify polypeptides as antigens with the potential as TB diagnostic reagents the ability of the proteins to induce a skin test response will be evaluated in the guinea pig model where groups of guinea pigs have been infected with either M. tuberculosis or M. avium or vaccinated with BCG.

To evaluate the response in M. tuberculosis infected guinea pigs, female outbred guinea pigs will be infected via an ear vein with 1×10⁴ CFU of M. tuberculosis H37Rv in 0.2 ml of PBS or aerosol infected (in an exposure chamber of a Middlebrook Aerosol Generation device) with 1×10⁵ CFU/ml of M. tuberculosis Erdman, giving rise to 10-15 granulomas per animal in the lung. After 4 weeks, skin tests will be performed with the polypeptides diluted in 0.1 ml of PBS, and the reaction diameter is measured 24 hours after the injection.

To evaluate the response in M. avium infected guinea pigs, female outbred guinea pigs will be infected intradermally with 2×10⁶ CFU of a clinical isolate of M. avium (Atyp. 1443; Statens Serum Institut, Denmark). Skin tests are performed 4 weeks later with the polypeptides diluted in 0.1 ml of PBS, and the reaction diameter is measured 24 hours after the injection.

To evaluate the response in BCG vaccinated guinea pigs, female outbred guinea pigs will be sensitized intradermally with 2×10⁶ CFU of BCG (BCG Danish 1331; Statens Serum Institut). Skin tests are performed 4 weeks later with the polypeptides diluted in 0.1 ml of PBS, and the reaction diameter is measured 24 hours after the injection.

If a polypeptide induces a significant reaction in animals infected with M. tuberculosis but not in BCG vaccinated guinea pigs, this polypeptide may have potential as a diagnostic reagent to differentiate between BCG vaccinated and M. tuberculosis infected individuals, which will thereafter be evaluated in the human population.

If a polypeptide induces a reaction in M. tuberculosis infected guinea pigs but not in guinea pigs infected with M. avium, this polypeptide may have potential as a diagnostic reagent to differentiate between an individual infected with M. tuberculosis and an individual infected with Mycobacteria not belonging to the tuberculosis complex. The polypeptide may also have potential as a diagnostic reagent to differentiate between an M. avium and an M. tuberculosis infected individual.
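The interpretation rules for the guinea pig skin tests can be expressed as a small decision function. This is only an illustrative sketch: the boolean inputs stand for "significant skin reaction observed" in each animal group, and the function name and output wording are assumptions rather than part of the described procedure.

```python
def possible_uses(reacts_mtb: bool, reacts_bcg: bool, reacts_mavium: bool) -> list[str]:
    """List the discriminations a polypeptide may support, per the rules in the text."""
    uses = []
    if reacts_mtb and not reacts_bcg:
        uses.append("BCG vaccinated vs. M. tuberculosis infected")
    if reacts_mtb and not reacts_mavium:
        uses.append("M. avium infected vs. M. tuberculosis infected")
    return uses


# Hypothetical example: reaction only in M. tuberculosis infected animals.
print(possible_uses(reacts_mtb=True, reacts_bcg=False, reacts_mavium=False))
```

A polypeptide that gives no reaction in M. tuberculosis infected animals returns an empty list, since neither rule applies.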

Example 6A Serological Recognition of the Recombinant Produced Proteins

To test the potential of the proteins as serological antigens, sera were collected from 8 TB patients and 4 healthy BCG non-vaccinated controls and assayed for antibodies recognizing the recombinantly produced proteins in an ELISA assay as follows: Each of the sera was absorbed with Promega E. coli extract (S37761) for 4 hours at room temperature and the supernatants collected after centrifugation. 0.5 μg of the proteins in carbonate buffer at pH 9.6 was adsorbed overnight at 5° C. to a polystyrene plate (Maxisorp, Nunc). The plates were washed in PBS-0.05% Tween-20 and the sera applied in a dilution of 1:100. After 1 hour of incubation the plates were washed 3 times with PBS-0.05% Tween-20 and 100 μl per well of peroxidase-conjugated Rabbit Anti-Human IgA, IgG, IgM was applied in a dilution of 1:8000. After 1 hour of incubation the plates were washed 3 times with PBS-0.05% Tween-20. 100 μl of substrate (TMB PLUS, Kem-En-Tec) was added per well, the reaction was stopped after 30 min with 0.2 M sulphuric acid, and the absorbance was read at 405 nm. The results are shown in Table 4.

TABLE 4 Serological recognition of the proteins by TB patients (n = 8) and healthy controls (n = 4). The percentage of responders as well as the number of persons responding in each group is indicated. The cut-off values for positive responses are OD 0.2 for CFP8A, CFP16, CFP23 and RD1-ORF3, OD 0.25 for CFP21, TB15A and TB16 and OD 0.3 for TB51.

Protein      Percent (n) responders   Percent (n) positive healthy controls
CFP8A        63 (5)                   0 (0)
CFP16        50 (4)                   0 (0)
CFP21        80 (6)                   0 (0)
CFP23        50 (4)                   0 (0)
RD1-ORF3     25 (2)                   0 (0)
TB15A        25 (2)                   0 (0)
TB16         100 (8)                  0 (0)
TB51         13 (1)                   0 (0)

As shown in Table 4, all the proteins are recognized by at least 13% of the tested TB patients. CFP8A, CFP16 and CFP21 are recognized by 50% or more of the TB patients tested and, most remarkably, all the tested TB patients recognized TB16. In addition, CFP8A, CFP16, CFP21, CFP23, RD1-ORF3, TB15A, TB16 and TB51 were recognized with a very high OD (>0.5) by some of the TB patients, indicating a particularly high amount of specific antibodies to these proteins. None of the proteins is recognized by healthy non-BCG vaccinated controls, which demonstrates the potential of these proteins to differentiate between M. tuberculosis infected individuals and healthy individuals. All these proteins are therefore excellent diagnostic candidates.

Example 6B Serological Recognition of Single Recombinant Produced Proteins and Mixtures of the Recombinant Produced Proteins

To evaluate the potential of 39 recombinantly produced proteins as serological antigens, sera were collected from 42 TB patients and 32 healthy controls and assayed for antibodies recognizing the recombinantly produced proteins in an ELISA assay as follows: Each of the sera was absorbed with Promega E. coli extract (S3761) for 4 hours at room temperature and the supernatants collected after centrifugation. 0.5 μg of the proteins in carbonate buffer at pH 9.6 was adsorbed overnight at 5° C. to a polystyrene plate (Maxisorp, Nunc). The plates were washed in PBS-0.05% Tween-20 and the sera applied in a dilution of 1:100. After 1 hour of incubation the plates were washed 3 times with PBS-0.05% Tween-20 and 100 μl per well of peroxidase-conjugated Rabbit Anti-Human IgA, IgG, IgM was applied in a dilution of 1:8000. After 1 hour of incubation the plates were washed 3 times with PBS-0.05% Tween-20. 100 μl of substrate (TMB PLUS, Kem-En-Tec) was added per well, the reaction was stopped after 30 min with 0.2 M sulphuric acid, and the absorbance was read at 405 nm.

The results were evaluated for all the 39 tested proteins and, on the basis of these results, 7 antigens were selected for their superior abilities as serological antigens, as shown in Table 5. For comparison, the result for the well-known serological antigen 38 kDa is also shown in Table 5.

TABLE 5 Serological recognition of the proteins by TB patients (n = 42) and healthy controls (n = 32). The number of responders as well as the calculated sensitivity and specificity is indicated for each antigen. Cut-off is defined as MeanControl + 3 SD for the individual antigen.

             TB patients                   Healthy controls
Protein      Positive  High responders*    Positive  High responders*    Sensitivity  Specificity
CFP8A        11        2                   1         0                   26%          96.9%
TB15A        7         2                   1         0                   17%          96.9%
CFP16        10        6                   1         0                   24%          96.9%
TB16         23        9                   0         0                   55%          100%
CFP21        13        3                   1         0                   31%          96.9%
CFP23^(a)    12        3                   1         0                   23%          97%
TB32         9         2                   1         0                   21%          96.9%
TB51         14        5                   0         0                   33%          100%
RD1-ORF3     6         5                   0         0                   14%          100%
38 kDa       6         2                   0         0                   14%          100%

*High responders defined as OD values > MeanControl + 6 SDControl for each individual antigen.
^(a)53 TB patients and 33 healthy controls were assayed for antibodies recognizing CFP23.
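The cut-off rule used for Table 5 (positive above MeanControl + 3 SD, high responder above MeanControl + 6 SD) can be sketched as follows. The OD values below are hypothetical placeholders, not data from the study, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def cutoffs(control_ods: list[float]) -> tuple[float, float]:
    """Return (positive cut-off, high-responder cut-off) from control OD values."""
    m = mean(control_ods)
    sd = stdev(control_ods)  # sample SD; an assumption, see lead-in
    return m + 3 * sd, m + 6 * sd


def classify(od: float, positive_cutoff: float, high_cutoff: float) -> str:
    """Classify one serum OD reading against the two cut-offs."""
    if od > high_cutoff:
        return "high responder"
    if od > positive_cutoff:
        return "positive"
    return "negative"


# Hypothetical OD readings for one antigen.
control_ods = [0.10, 0.12, 0.08, 0.11, 0.09]
patient_ods = [0.12, 0.16, 0.30]

pos_cut, high_cut = cutoffs(control_ods)
for od in patient_ods:
    print(od, classify(od, pos_cut, high_cut))
```

Because the cut-off is computed per antigen from its own control readings, each antigen in the table has its own threshold.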

For a diagnostic reagent for TB it is crucial to have a high specificity in order to avoid false positive results, which may lead to anti-TB treatment of healthy people. We therefore selected the serological antigens on the basis of either a high specificity (more than 90%) combined with high sensitivity, or the ability to enhance the sensitivity of a protein cocktail when combined with other antigens without compromising the high specificity. Also included in Table 5 is the 38 kDa antigen, which is a well-documented antigen and is believed to be one of the most promising serological proteins (Cole, R. A., et al 1996). As shown in Table 5, the 38 kDa antigen has a sensitivity of 14% in the tested patient group, and all the selected antigens shown in Table 5 perform similarly to or with a higher sensitivity than the 38 kDa antigen without compromising the specificity (all selected antigens have a specificity of more than 96%). TB16 and TB51 are particularly outstanding, with sensitivities of 55% and 33%, respectively, and a specificity of 100%. Also important is the fact that all these selected antigens induce a very high response in two or more donors, which demonstrates their potency as diagnostic reagents.
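The sensitivity and specificity figures quoted from Table 5 follow directly from the responder counts. The sketch below recomputes them for a few antigens, assuming sensitivity = positives among the 42 TB patients and specificity = non-responders among the 32 healthy controls; the dictionary layout and function names are illustrative.

```python
TB_PATIENTS = 42
HEALTHY_CONTROLS = 32

# (positive TB patients, positive healthy controls), taken from Table 5.
COUNTS = {
    "TB16": (23, 0),
    "TB51": (14, 0),
    "CFP21": (13, 1),
    "38 kDa": (6, 0),
}


def sensitivity(positive_patients: int, n_patients: int = TB_PATIENTS) -> float:
    """Fraction of TB patients scored positive."""
    return positive_patients / n_patients


def specificity(positive_controls: int, n_controls: int = HEALTHY_CONTROLS) -> float:
    """Fraction of healthy controls scored negative."""
    return (n_controls - positive_controls) / n_controls


for antigen, (tp, fp) in COUNTS.items():
    print(f"{antigen}: sensitivity {sensitivity(tp):.3f}, specificity {specificity(fp):.5f}")
```

For example, 23/42 is approximately 0.548 (reported as 55%), and one positive control out of 32 gives 31/32 = 0.96875 (reported as 96.9%).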

For a diagnostic TB reagent it is important to achieve a very high sensitivity and, as demonstrated in Table 6, this can be achieved by combining the antigens identified above. In practice this can be accomplished either by mixing the antigens in the same well of the ELISA plate or by combining the results from multiple wells incubated with the same blood sample. Alternatively, the proteins of interest can be produced as recombinant fusion proteins comprising at least two proteins or B cell epitopes, and the resulting fusion molecule can thereafter be used in the serological assays.

The antibody response in tuberculosis is heterogeneous, with considerable person-to-person variation in which antigens are recognized by the antibodies (Lyashcenko, K. et al 1998). It can therefore be an advantage to use combinations of proteins (e.g. in protein cocktails), which may increase the sensitivity and be recognized by sera from a high proportion of infected individuals.

TABLE 6 Calculated sensitivity (sens.) and specificity (spec.) of selected antigen combinations

#   Antigens                           Sens.   Spec.
2   TB16 + TB51                        62%     100%
2   TB15A + TB16                       64%     97%
2   TB16 + CFP21                       67%     97%
3   TB15A + TB16 + TB51                71%     97%
3   CFP16 + TB16 + CFP21               71%     94%
3   TB16 + CFP21 + TB51                74%     97%
3   TB15A + TB16 + CFP21               74%     94%
4   CFP16 + CFP17 + CFP21 + TB51       64%     94%
4   CFP8A + CFP16 + TB16 + CFP21       76%     94%
4   CFP16 + TB16 + CFP21 + TB51        79%     94%

For the combinations shown in Table 6 it is advantageous to combine from two to four antigens, which gives a higher sensitivity than the single antigens while retaining a high specificity (more than 90%). In particular, the combinations CFP16+TB16+CFP21+TB51, TB16+CFP21+TB51 and TB15A+TB16+CFP21 are very efficient in this study population. The combinations shown in Table 6 are only examples, and other useful combinations can be envisaged, as up to eight antigens may be combined and lead to increased sensitivity. In addition, other antigens can be combined with the above defined proteins, for example the 38 kDa antigen, which may be combined with any of the above described antigens and may increase the sensitivity. In this respect it is of importance that different populations have been observed to react to different antigens (Julian, E. et al 2000, Lyashcenko, K. et al 1998), and it may therefore be necessary to define individual combinations for different populations. Therefore, combinations which do not give high sensitivity in the tested study population may be very efficient as diagnostic reagents when tested in another population.
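One way to read a combined result is that a serum counts as positive when it is positive for at least one antigen in the combination. The sketch below illustrates that rule; it is an assumption about how the combined figures could be calculated, and the patient and control identifiers are hypothetical, not data from the study.

```python
def combined_positives(responders: dict[str, set[int]], antigens: list[str]) -> set[int]:
    """Individuals positive for at least one antigen in the combination."""
    combined: set[int] = set()
    for antigen in antigens:
        combined |= responders[antigen]
    return combined


# Hypothetical responder sets (individual IDs) for 10 patients and 10 controls.
PATIENT_RESPONDERS = {"TB16": {1, 2, 3, 4, 5}, "TB51": {4, 5, 6}, "CFP21": {1, 7}}
CONTROL_RESPONDERS = {"TB16": set(), "TB51": set(), "CFP21": {3}}
N_PATIENTS = 10
N_CONTROLS = 10

for combo in (["TB16"], ["TB16", "TB51"], ["TB16", "TB51", "CFP21"]):
    sens = len(combined_positives(PATIENT_RESPONDERS, combo)) / N_PATIENTS
    spec = 1 - len(combined_positives(CONTROL_RESPONDERS, combo)) / N_CONTROLS
    print(" + ".join(combo), f"sensitivity={sens:.0%}", f"specificity={spec:.0%}")
```

Under this rule, adding an antigen can only raise sensitivity or leave it unchanged, while specificity can only fall or stay the same, which is the trade-off the combinations are meant to balance.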

LIST OF REFERENCES

-   Andersen et al. (1993) J. Immunol. Methods 161: 29-39.
-   Andersen P. et al., 1995, J. Immunol. 154: 3359-72
-   Andersen P., 1994, Infect. Immun. 62: 2536-44.
-   Andersen, P. and Heron, I, 1993, J. Immunol. Methods 161: 29-39.
-   Andersen, Å. B. et al., 1992, Infect. Immun. 60: 2317-2323.
-   Barkholt, V. and Jensen, A. L., 1989, Anal. Biochem. 177: 318-322.
-   Boesen et al (1995). Infection and Immunity 63:1491-1497
-   Borodovsky, M., and J. McIninch. 1993, Computers Chem. 17: 123-133.
-   Chang, C. D et al (1978) Nature, 375:515
-   Cole, R. A., et al 1996, Tuberc. Lung Dis. 77:363-368
-   Flesch, I. and S. H. E. Kaufmann (1987) J. Immunol. 138(12):4408-13.
-   Goeddel et al., (1979) Nature 281:544
-   Gosselin et al., 1992, J. Immunol. 149: 3477-3481.
-   Harboe, M. et al., 1996, Infect. Immun. 64: 16-22.
-   Hochstrasser, D. F. et al., 1988, Anal. Biochem. 173: 424-435
-   Hopp and Woods (1981) Proc Natl Acad Sci USA. 78(6):3824-8.
-   Itakura et al., (1977) Science 198:1056
-   Jameson and Wolf, (1988) Comput Appl Biosci, 4(1):181-6
-   Julian, E., et al 2000, Int J Tuberc Lung Dis 4(11):1082-1085.
-   Kyte and Doolittle, (1982) J Mol Biol, 157(1):105-32
-   Köhler, G. and Milstein, C., 1975, Nature 256: 495-497.
-   Li, H. et al., 1993, Infect. Immun. 61: 1730-1734.
-   Lindblad E. B. et al., 1997, Infect. Immun. 65: 623-629.
-   Lyashcenko, K., et al 1998, Infection and Immunity 66(8):3936-3940.
-   Mahairas, G. G. et al., 1996, J. Bacteriol 178: 1274-1282.
-   Maniatis T. et al., 1989, “Molecular cloning: a laboratory manual”, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
-   Nagai, S. et al., 1991, Infect. Immun. 59: 372-382.
-   Oettinger, T. and Andersen, Å. B., 1994, Infect. Immun. 62: 2058-2064.
-   Ohara, N. et al., 1995, Scand. J. immunol. 41: 233-442.
-   Pal P. G. and Horwitz M. A., 1992, Infect. Immun. 60: 4781-92.
-   Pearson, W. R. and Lipman D. J., 1988. Proc. Natl. Acad. Sci. USA 85: 2444-2448.
-   Ploug, M. et al., 1989, Anal. Biochem. 181: 33-39.
-   Porath, J. et al., 1985, FEBS Lett. 185: 306-310.
-   Roberts, A. D. et al., 1995, Immunol. 85: 502-508.
-   Rook, G. A. W. (1990) Res. Microbiol. 141:253-256.
-   Siebwenlist et al., (1980) Cell, 20: 269
-   Sørensen, A. L. et al., 1995, Infect. Immun. 63: 1710-1717.
-   Theisen, M. et al., 1995, Clinical and Diagnostic Laboratory Immunology, 2: 30-34.
-   Ulmer, J. B. et al., (1993) Curr. Opin. Invest. Drugs, 2:983-989
-   Valdés-Stauber, N. and Scherer, S., 1994, Appl. Environ. Microbiol. 60: 3809-3814.
-   Valdés-Stauber, N. and Scherer, S., 1996, Appl. Environ. Microbiol. 62: 1283-1286.
-   van Dyke M. W. et al., 1992. Gene pp. 99-104.
-   von Heijne, G., 1984, J. Mol. Biol. 173: 243-251.
-   Williams, N., 1996, Science 272: 27.
-   Young, R. A. et al., 1985, Proc. Natl. Acad. Sci. USA 82: 2583-2587.

1-20. (canceled)
 21. A nucleic acid fragment in isolated form which 1) comprises a nucleic acid sequence which encodes a polypeptide which a) comprises an amino acid sequence as shown in SEQ ID NO: 2, b) comprises a subsequence of the polypeptide fragment defined in a) which has a length of at least 6 amino acid residues, said subsequence being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex, or c) comprises an amino acid sequence having a sequence identity with the polypeptide defined in a) or the subsequence defined in b) of at least 70% and at the same time being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex; or comprises a nucleic acid sequence complementary thereto, 2) has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions with a nucleic acid fragment which has a nucleotide sequence as defined in SEQ ID NO: 1 or a sequence complementary thereto.
 22. A nucleic acid fragment according to claim 21, wherein said polypeptide fragment comprises an epitope for a T-helper cell.
 23. A nucleic acid fragment according to claim 21, wherein the polypeptide fragment is free from any signal sequence.
 24. A nucleic acid fragment according to claim 21, wherein said polypeptide fragment 1) induces a release of IFN-γ from primed memory T-lymphocytes withdrawn from a mouse within 2 weeks of primary infection or within 4 days after the mouse has been re-challenge infected with mycobacteria belonging to the tuberculosis complex, the induction performed by the addition of the polypeptide to a suspension comprising about 200,000 spleen cells per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension, and/or 2) induces a release of IFN-γ of at least 300 pg above background level from about 1,000,000 human PBMC (peripheral blood mononuclear cells) per ml isolated from TB patients in the first phase of infection, or from healthy BCG vaccinated donors, or from healthy contacts to TB patients, the induction being performed by the addition of the polypeptide to a suspension comprising the about 1,000,000 PBMC per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension; and/or 3) induces an IFN-γ release from bovine PBMC derived from animals previously sensitized with mycobacteria belonging to the tuberculosis complex, said release being at least two times the release observed from bovine PBMC derived from animals not previously sensitized with mycobacteria belonging to the tuberculosis complex.
 25. A nucleic acid fragment according to claim 21, wherein the sequence identity in c) is at least 80%, such as at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, or at least 98%. It is most preferred that the sequence identity is 100%.
 26. A nucleic acid fragment according to claim 21, wherein said polypeptide fragment is a fusion polypeptide comprising at least one polypeptide fragment as defined in claim 1 and at least one fusion partner.
 27. A nucleic acid fragment according to claim 26, wherein the fusion partner is selected from the group consisting of: (a) a polypeptide fragment derived from a virulent mycobacterium, such as ESAT-6, MPB64, MPT64, TB10.4, CFP10, RD1-ORF5, RD1, ORF2, Rv1036, Ag85A, Ag85B, Ag85C, 19 kDa lipoprotein, MPT32, MPB59 and alpha-crystallin; (b) a polypeptide according to claim 1, and (c) at least one immunogenic portion, e.g. a T-cell epitope, of any of the polypeptides in (a) or (b).
 28. A nucleic acid fragment according to claim 21, which is a DNA fragment.
 29. A nucleic acid fragment according to claim 21, which has a length of at least 15 nucleotides, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70 or at least 80 nucleotides.
 30. A nucleic acid fragment according to claim 21, which is more than 70% identical with a nucleic acid fragment which has a nucleotide sequence as defined in SEQ ID NO: 1.
 31. A vaccine comprising a nucleic acid fragment according to claim 21, the vaccine effecting in vivo expression of antigen by an animal, including a human being, to whom the vaccine has been administered, the amount of expressed antigen being effective to confer substantially increased resistance to infections with mycobacteria of the tuberculosis complex in an animal, including a human being.
 32. A vaccine for immunizing an animal, including a human being, against tuberculosis caused by mycobacteria belonging to the tuberculosis complex, comprising as the effective component a non-pathogenic microorganism, wherein at least one copy of a DNA fragment comprising a DNA sequence encoding a polypeptide as defined in claim 21 has been incorporated into the genome of the microorganism in a manner allowing the microorganism to express and optionally secrete the polypeptide.
 33. A vaccine according to claim 32, wherein the microorganism is a bacterium.
 34. A vaccine according to claim 33, wherein the bacterium is selected from the group consisting of the genera Mycobacterium, Salmonella, Pseudomonas and Escherichia.
 35. A vaccine according to claim 34, wherein the microorganism is Mycobacterium bovis BCG, such as Mycobacterium bovis BCG strain: Danish 1331.
 36. A vaccine according to claim 35, wherein at least 2 copies of an isolated DNA fragment comprising a nucleic acid sequence which encodes a polypeptide which comprises a member selected from the group consisting of: a) an amino acid sequence as shown in SEQ ID NO: 2, b) a subsequence of the polypeptide fragment defined in a) which has a length of at least 6 amino acid residues, said subsequence being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex, or c) an amino acid sequence having a sequence identity with the polypeptide defined in a) or the subsequence defined in b) of at least 70% and at the same time being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex; or comprises a nucleic acid sequence complementary thereto, wherein said DNA fragment has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions with a nucleic acid fragment which has a nucleotide sequence as defined in SEQ ID NO: 1 or a sequence complementary thereto.
 37. A vaccine according to claim 36, wherein the number of copies is at least 5.
 38. A replicable expression vector which comprises a nucleic acid fragment according to claim 21.
 39. A vector according to claim 38, which is an autonomously replicating vector.
 40. A vector according to claim 39, which is selected from the group consisting of a virus, a bacteriophage, a plasmid, a cosmid and a micro chromosome.
 41. A vector according to claim 38, which is able to become integrated into the genome of the host cell.
 42. A vector according to claim 38, which is a DNA segment encoding a reporter gene product useful for identification of heterologous gene products and/or a resistance gene such as an antibiotic resistance gene.
 43. A transformed cell harboring at least one vector according to claim 38.
 44. A transformed cell according to claim 43, which is a bacterium belonging to the tuberculosis complex, such as a M. tuberculosis bovis BCG cell.
 45. A transformed cell according to claim 43, which expresses a polypeptide as defined in claim 1.
 46. A method for producing a polypeptide as defined in claim 21, comprising inserting a nucleic acid fragment according to claim 21 into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell, culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide, and recovering the polypeptide from the host cell or culture medium; or isolating the polypeptide from a short-term culture filtrate as defined in claim 21; or isolating the polypeptide from whole mycobacteria of the tuberculosis complex or from lysates or fractions thereof, e.g. cell wall containing fractions; or synthesizing the polypeptide by solid or liquid phase peptide synthesis.
 47. A method for immunising an animal, including a human being, against tuberculosis caused by mycobacteria belonging to the tuberculosis complex, comprising administering to the animal the vaccine according to claim 33.
 48. A method according to claim 47, wherein the vaccine is administered by the parenteral (such as intravenous and intraarterial), intraperitoneal, intramuscular, subcutaneous, intradermal, oral, buccal, sublingual, nasal, rectal or transdermal route.
 49. A composition for diagnosing tuberculosis in an animal, including a human being, comprising a nucleic acid fragment according to claim 21, optionally in combination with a means for detection.
 50. A method for determining the presence of mycobacterial nucleic acids in an animal, including a human being, or in a sample, comprising administering a nucleic acid fragment according to claim 21 to the animal or incubating the sample with the nucleic acid fragment, and detecting the presence of hybridized nucleic acids resulting from the incubation.
 51. A method for stimulating in an animal an immunogenic response against bacteria belonging to the tuberculosis complex which comprises administering thereto, alone and/or together with an adjuvant, a polypeptide having at least 90% sequence identity with the sequence set forth in SEQ ID NO: 2, wherein said immune response causes in said animal an increased resistance against infections with bacteria belonging to the tuberculosis complex.
 52. A method for effecting in vivo expression of an antigen by an animal comprising administering thereto the nucleic acid sequence according to claim 21, wherein said expression confers resistance to infections with mycobacteria of the tuberculosis complex.
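
The numerical criteria recited in claims 21, 24 and 25 (a sequence identity of at least 70%, an IFN-γ release of at least 300 pg above background, and a bovine IFN-γ release of at least two times the unsensitized control) may be illustrated by the following minimal Python sketch. It is an illustration only and forms no part of the claims. The identity formula used, (Nref - Ndif) * 100 / Nref over two pre-aligned sequences with gaps counted as non-identities, is an assumption, since the claims themselves do not state how identity is calculated; all function names and the example sequences are likewise hypothetical.

```python
def percent_identity(ref_aligned: str, query_aligned: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed formula: (n_ref - n_dif) * 100 / n_ref, where n_ref is the
    number of residues in the reference and n_dif is the number of
    alignment columns that differ (gaps count as differences).
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    n_ref = sum(1 for c in ref_aligned if c != "-")
    if n_ref == 0:
        raise ValueError("reference sequence contains no residues")
    n_dif = sum(1 for r, q in zip(ref_aligned, query_aligned) if r != q)
    return (n_ref - n_dif) * 100.0 / n_ref


def meets_identity_threshold(ref_aligned: str, query_aligned: str,
                             threshold: float = 70.0) -> bool:
    """True if identity is at least `threshold` percent (70% in claim 21 c))."""
    return percent_identity(ref_aligned, query_aligned) >= threshold


def human_pbmc_response(release_pg: float, background_pg: float,
                        min_excess_pg: float = 300.0) -> bool:
    """Claim 24, item 2): IFN-gamma release at least 300 pg above background."""
    return (release_pg - background_pg) >= min_excess_pg


def bovine_pbmc_response(sensitized_release: float,
                         unsensitized_release: float,
                         min_fold: float = 2.0) -> bool:
    """Claim 24, item 3): release at least two times the unsensitized control."""
    return sensitized_release >= min_fold * unsensitized_release


if __name__ == "__main__":
    # Hypothetical sequences: two mismatches over eight reference residues.
    print(percent_identity("AGTCAGTC", "AATCAATC"))
    print(meets_identity_threshold("AGTCAGTC", "AGT--GTC", threshold=80.0))
    print(human_pbmc_response(release_pg=450.0, background_pg=100.0))
    print(bovine_pbmc_response(sensitized_release=500.0,
                               unsensitized_release=300.0))
```

The thresholds are exposed as parameters so that the stricter identity levels of claim 25 (80%, 85% and so on) can be checked with the same function. Producing the alignment itself is outside the scope of the sketch.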